
C/C++ getting struct size

Today, to my great surprise, I discovered that

When the sizeof operator is applied to a class, struct, or union type, the result is the number of bytes in an object of that type, plus any padding added to align members on word boundaries. The result does not necessarily correspond to the size calculated by adding the storage requirements of the individual members.

I didn't know this, and I'm pretty sure it is breaking some of my old code. To read binary files I used to have structs like this one:

struct Header
{
    union {
        char identc[4];
        uint32 ident;
    };
    uint16 version;
};

and to read those 6 bytes directly with fread driven by sizeof:

fread( &header, sizeof(header), 1, f );

But now sizeof(header) returns 8!


Is it possible that with older GCC versions sizeof(header) returned 6, or am I losing my mind?

Anyway, is there any other operator (or preprocessor directive, or whatever) that tells me how big the struct is, excluding padding?

Otherwise, what would be a clean way to read a raw-data struct from a file that doesn't require writing too much code?


EDIT: I know that this isn't the correct way to read/write binary data: I'd get different results depending on machine endianness and so on. Still, this method is the fastest one. I'm just trying to read some binary data to quickly get at its content, not to write a good application that I'm going to use in the future or release.


What you want is the #pragma pack directive. It lets you set the packing alignment to any value the compiler supports. Typically you would set the packing value to 1 before your structure definition and then restore the default value after the definition.

Note that this does not do anything to guarantee portability between systems.
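As a rough sketch of that approach, the block below packs the question's struct with the push/pop form of the pragma, which GCC, Clang and MSVC accept. The name PackedHeader and the use of the <stdint.h> types in place of the question's uint32/uint16 typedefs are my own assumptions, and the anonymous union requires C11.

```c
#include <stdint.h>

/* Save the current packing setting and request 1-byte packing. */
#pragma pack(push, 1)
struct PackedHeader {          /* hypothetical name, mirrors the question's Header */
    union {
        char     identc[4];
        uint32_t ident;
    };                         /* anonymous union: standard since C11 */
    uint16_t version;
};
#pragma pack(pop)              /* restore whatever packing was active before */

/* Compilation fails here if the compiler still inserted padding. */
_Static_assert(sizeof(struct PackedHeader) == 6,
               "PackedHeader should be exactly 6 bytes");
```

With this definition, fread( &header, sizeof(header), 1, f ) reads 6 bytes again. Packing also lowers the struct's alignment to 1, so members can end up misaligned and be slower to access on some targets, and it does nothing about byte order.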

See also: use-of-pragma-in-c and various other questions on SO


Yes, the code you presented isn't portable. Not only structure sizes but also byte orders might differ.


This is not the correct way to process binary files. Aside from alignment issues, it also has endian issues. The proper way to read binary files is with an array of uint8_t (or unsigned char, it really doesn't matter) and your own functions to build an in-memory representation out of the data.
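A minimal sketch of that byte-array approach for the 6-byte header from the question follows. It assumes the file stores the version field least significant byte first (little-endian), which the question does not state; the function names are hypothetical, and the union from the question is dropped in favour of the plain character array.

```c
#include <stdint.h>
#include <stdio.h>

struct Header {
    char     identc[4];
    uint16_t version;
};

/* Build a 16-bit value from two bytes, least significant byte first.
   Assumption: the file format is little-endian. */
uint16_t load_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Convert the 6 on-disk bytes into the in-memory struct. */
void parse_header(const unsigned char buf[6], struct Header *out)
{
    for (int i = 0; i < 4; i++)
        out->identc[i] = (char)buf[i];
    out->version = load_u16_le(buf + 4);
}

/* Read exactly 6 bytes from f; returns 1 on success, 0 on a short read. */
int read_header(FILE *f, struct Header *out)
{
    unsigned char buf[6];
    if (fread(buf, 1, sizeof buf, f) != sizeof buf)
        return 0;
    parse_header(buf, out);
    return 1;
}
```

Because the number of bytes read is spelled out rather than taken from sizeof, the result is the same whatever padding or byte order the compiler and machine use.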


Most compilers provide a specific extension that lets you control the packing of structs. However, when you write the struct in binary, you should be able to write it and read it back regardless of packing, because writing the struct also writes sizeof(struct) bytes. The only case where this causes trouble is if you want to read files created by previous versions. You also need to consider byte-order issues, etc.


Your question is compiler specific, but generally if you build your structure such that each member lies on a boundary of the same size as itself (four-byte elements on boundaries divisible by four, etc.), you'll get the behavior you want. Watch also for cases like the one you presented, where padding comes at the end of a structure so that the first element of the next structure would be aligned if the structures were laid out in an array.
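One way to check such a layout is to make the tail padding an explicit member and let the compiler verify the offsets. In this sketch the struct name and the reserved field are hypothetical, and the expected numbers hold on mainstream ABIs rather than being guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdint.h>

struct AlignedHeader {         /* hypothetical variant of the question's Header */
    uint32_t ident;            /* offset 0, 4 bytes */
    uint16_t version;          /* offset 4, 2 bytes */
    uint16_t reserved;         /* offset 6, hypothetical filler replacing the hidden padding */
};

/* Compilation fails if the compiler lays the struct out differently. */
_Static_assert(offsetof(struct AlignedHeader, version) == 4, "unexpected offset");
_Static_assert(offsetof(struct AlignedHeader, reserved) == 6, "unexpected offset");
_Static_assert(sizeof(struct AlignedHeader) == 8, "unexpected size");
```

This only helps when you control the file format, since the on-disk record becomes 8 bytes instead of 6.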


It seems that you haven't actually asked a question, so I'm not sure why I am even trying to answer! But yes, packing is important and will change depending on compiler versions, flags, target architecture, pragmas, wind direction, phases of the moon and potentially many other things. Dumping binary to a file (or socket) is not a very good way of serializing anything.


This extra padding is necessary to get the members aligned properly when you create an array of these structures. Without it, the 2nd element of the array would have the ident member aligned on an address that's not a multiple of 4.
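The arithmetic can be sketched as follows. The two helper functions are hypothetical, the <stdint.h> types stand in for the question's typedefs, and the 4-byte alignment of uint32_t is an assumption that holds on typical platforms.

```c
#include <stddef.h>
#include <stdint.h>

struct Header {
    union {
        char     identc[4];
        uint32_t ident;
    };
    uint16_t version;
};

/* Byte offset of element `index` in an array whose elements are
   `stride` bytes apart. */
size_t element_offset(size_t index, size_t stride)
{
    return index * stride;
}

/* Nonzero if `offset` is a multiple of the alignment Header requires. */
int is_aligned_for_header(size_t offset)
{
    return offset % _Alignof(struct Header) == 0;
}
```

With the real stride, sizeof(struct Header), element 1 starts at an aligned offset. With a stride of 6 it would start at byte 6, which is not a multiple of 4, so its ident member would be misaligned.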

It is probably too late to do anything about it, since you probably wrote files with this structure before. Changing the packing will make these files unreadable. But, yes, having file data that's dependent on compiler settings isn't the greatest idea. Storing data in a human-readable format is common these days: the disk bytes and CPU cycles a binary format saves are rarely worth it.


Yes, the alignment problem. That is why internet protocol messages have aligned structs so that this problem can be avoided when sending data over the network.

What you can do is either fix your structs so that they are aligned properly, or have marshalling functions that you use when saving and retrieving data.
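A marshalling pair for the header in the question might look like the sketch below. All function names and HEADER_DISK_SIZE are hypothetical, and the little-endian byte order for version is an assumption about the file format.

```c
#include <stdint.h>
#include <stdio.h>

struct Header {
    char     identc[4];
    uint16_t version;
};

enum { HEADER_DISK_SIZE = 6 };   /* size of the record in the file, not sizeof(struct Header) */

/* Marshal: struct to bytes, version stored least significant byte first. */
void header_to_bytes(const struct Header *h, unsigned char out[HEADER_DISK_SIZE])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)h->identc[i];
    out[4] = (unsigned char)(h->version & 0xFF);
    out[5] = (unsigned char)(h->version >> 8);
}

/* Unmarshal: bytes back to struct. */
void header_from_bytes(const unsigned char in[HEADER_DISK_SIZE], struct Header *h)
{
    for (int i = 0; i < 4; i++)
        h->identc[i] = (char)in[i];
    h->version = (uint16_t)(in[4] | (in[5] << 8));
}

/* Write the marshalled header to f; returns 1 on success, 0 on failure. */
int save_header(FILE *f, const struct Header *h)
{
    unsigned char buf[HEADER_DISK_SIZE];
    header_to_bytes(h, buf);
    return fwrite(buf, 1, sizeof buf, f) == sizeof buf;
}
```

Saving and retrieving both go through the same fixed byte layout, so the file contents no longer depend on how a particular compiler pads the struct.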
