
Strange object serialization problem in file parsing

I have a strange problem with object serialization. The file format documentation states the following:

The lead in starts with a 4-byte tag that identifies a TDMS segment ("TDSm"). The next four bytes are used as a bit mask in order to indicate what kind of data the segment contains. This bit mask is referred to as ToC (Table of Contents). Any combination of the following flags can be encoded in the ToC:

The next four bytes contain a version number (32-bit unsigned integer), which specifies the oldest TDMS revision a segment complies with. At the time of this writing, the version number is 4713. The only previous version of TDMS has number 4712.

The next eight bytes (64-bit unsigned integer) describe the length of the remaining segment (overall length of the segment minus length of the lead in). If further segments are appended to the file, this number can be used to locate the starting point of the following segment. If an application encountered a severe problem while writing to a TDMS file (crash, power outage), all bytes of this integer can be 0xFF. This can only happen to the last segment in a file.

The last eight bytes (64-bit unsigned integer) describe the overall length of the meta information in the segment. This information is used for random access to the raw data. If the segment contains no meta data at all (properties, index information, object list), this value will be 0.

So I implemented it as:

class TDMsLEADIN {
public:
    char   Signature[4];    //TDSm
    __int32     Toc;
    unsigned __int32     vernum;
    unsigned __int64  nextSegmentOff;
    unsigned __int64  rawDataOff;
};
fread(&leadin,sizeof(TDMsLEADIN),1,f);

With this I got Signature="TDSm", Toc=6 and vernum=4712 as expected, but nextSegmentOff=833223655424 and rawDataOff=8589934592, whereas I expected both nextSegmentOff and rawDataOff to be 194.

Then I broke the class into two parts and read the two parts separately:

class TDMsLEADIN {
public:
    char   Signature[4];    //TDSm
    __int32     Toc;
    unsigned __int32     vernum;

};
class TDMsLeadINend{
public:
    unsigned __int64 nextSegmentOff;
    unsigned __int64 rawDataOff;
};
    fread(&leadin,sizeof(TDMsLEADIN),1,f);
    fread(&leadin2,sizeof(TDMsLeadINend),1,f);

Then I got nextSegmentOff and rawDataOff as expected (194). My question is: what is wrong with the original code, and why does it work when I break it into two parts? I tried unsigned long long instead of unsigned __int64, but got the same result. It is quite strange.

Thanks


You seem to be just reading and writing the binary data in the struct directly.

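Generally the compiler aligns structure members for performance, so in the single struct there is a hidden 32-bit pad between vernum and nextSegmentOff to put nextSegmentOff on an 8-byte boundary. The file has no such gap, so fread drops the first four bytes of the real nextSegmentOff into the padding, and the nextSegmentOff member receives the remaining four bytes of that value followed by the first four bytes of rawDataOff. That is why you see 833223655424, which is 194 shifted left by 32 bits. When the class is split into two structures there is no padding between the fields being read, so every field lines up with the file.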
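The shifted values can be checked with a little arithmetic. The following is only a sketch (C++14 or later): kFileBytes is a hypothetical reconstruction of the first 32 bytes of the file based on the values reported in the question, and the final four bytes (the value 2) are an assumption chosen to reproduce the reported rawDataOff. load_le64 is a helper name made up for this example.

```cpp
#include <cstdint>

// Decode 8 little-endian bytes starting at p.
constexpr std::uint64_t load_le64(const unsigned char* p) {
    std::uint64_t value = 0;
    for (int i = 7; i >= 0; --i) {
        value = (value << 8) | p[i];
    }
    return value;
}

// Hypothetical file contents, reconstructed from the question (assumption).
constexpr unsigned char kFileBytes[32] = {
    'T', 'D', 'S', 'm',          // bytes 0..3   signature
    0x06, 0, 0, 0,               // bytes 4..7   ToC = 6
    0x68, 0x12, 0, 0,            // bytes 8..11  version = 4712
    0xC2, 0, 0, 0, 0, 0, 0, 0,   // bytes 12..19 next segment offset = 194
    0xC2, 0, 0, 0, 0, 0, 0, 0,   // bytes 20..27 raw data offset = 194
    0x02, 0, 0, 0                // bytes 28..31 assumed first bytes after the lead in
};

// Where the fields really are in the file (no padding).
static_assert(load_le64(kFileBytes + 12) == 194, "next segment offset");
static_assert(load_le64(kFileBytes + 20) == 194, "raw data offset");

// Where a struct with 4 bytes of padding after vernum looks for them.
static_assert(load_le64(kFileBytes + 16) == 833223655424ULL, "shifted nextSegmentOff");
static_assert(load_le64(kFileBytes + 24) == 8589934592ULL, "shifted rawDataOff");
```

If this compiles, the two odd numbers from the question are exactly what a 4-byte shift produces: 194 << 32 and 2 << 32.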

You can test this by comparing sizeof(TDMsLEADIN) + sizeof(TDMsLeadINend) for the second version with sizeof(TDMsLEADIN) for the first version.
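A minimal sketch of that comparison, assuming the standard fixed-width types from <cstdint> as stand-ins for the MSVC-specific __int32 and __int64. The struct and constant names here are made up so that both versions can live in one file.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // fixed-width integer types

// Same member order as the original single class.
struct LeadInSingle {
    char          Signature[4];
    std::int32_t  Toc;
    std::uint32_t vernum;
    std::uint64_t nextSegmentOff;
    std::uint64_t rawDataOff;
};

// The two halves from the second attempt.
struct LeadInHead {
    char          Signature[4];
    std::int32_t  Toc;
    std::uint32_t vernum;
};
struct LeadInTail {
    std::uint64_t nextSegmentOff;
    std::uint64_t rawDataOff;
};

constexpr std::size_t kSingleSize = sizeof(LeadInSingle);
constexpr std::size_t kSplitSize  = sizeof(LeadInHead) + sizeof(LeadInTail);

// Unused bytes between the end of vernum and the start of nextSegmentOff.
constexpr std::size_t kGapAfterVernum =
    offsetof(LeadInSingle, nextSegmentOff) -
    (offsetof(LeadInSingle, vernum) + sizeof(std::uint32_t));
```

Printing kSingleSize, kSplitSize and kGapAfterVernum shows the difference. On MSVC and on common 64-bit ABIs I would expect 32, 28 and 4, but the exact numbers are up to the compiler, which is the point.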

The standard way to serialize data is to serialize each underlying piece individually rather than relying on the layout of a class or structure, as that layout can differ between compilers without notice.
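One way to do that, sketched here under the assumption that the lead in is stored little-endian: read the 28 on-disk bytes into a plain buffer and decode each field from its known offset, so that neither struct padding nor host byte order matters. LeadIn, load_le, decode_lead_in and read_lead_in are names invented for this example (C++14 or later).

```cpp
#include <cstdint>
#include <cstdio>

// In-memory form; its layout no longer has to match the file.
struct LeadIn {
    char          signature[4];
    std::uint32_t toc;
    std::uint32_t version;
    std::uint64_t next_segment_offset;
    std::uint64_t raw_data_offset;
};

// Decode an unsigned little-endian integer of `count` bytes starting at p.
constexpr std::uint64_t load_le(const unsigned char* p, int count) {
    std::uint64_t value = 0;
    for (int i = count - 1; i >= 0; --i) {
        value = (value << 8) | p[i];
    }
    return value;
}

// Decode the 28 on-disk bytes (4 + 4 + 4 + 8 + 8) field by field.
constexpr LeadIn decode_lead_in(const unsigned char* raw) {
    LeadIn out{};
    for (int i = 0; i < 4; ++i) {
        out.signature[i] = static_cast<char>(raw[i]);
    }
    out.toc                 = static_cast<std::uint32_t>(load_le(raw + 4, 4));
    out.version             = static_cast<std::uint32_t>(load_le(raw + 8, 4));
    out.next_segment_offset = load_le(raw + 12, 8);
    out.raw_data_offset     = load_le(raw + 20, 8);
    return out;
}

// Read one lead in from f; returns false on a short read.
inline bool read_lead_in(std::FILE* f, LeadIn& out) {
    unsigned char raw[28];
    if (std::fread(raw, 1, sizeof raw, f) != sizeof raw) {
        return false;
    }
    out = decode_lead_in(raw);
    return true;
}
```

Calling read_lead_in(f, leadin) in place of the original fread should then give 194 for both offsets on the file from the question. Checking the signature and handling a big-endian segment are left out of this sketch.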


Your problem is that your compiler hasn't packed the struct so that all the members sit next to each other. For example, your compiler may well have decided that it likes 64-bit variables to be 64-bit aligned in memory, and inserted 4 bytes of padding into your struct to achieve this.

If you really need the I/O performance that this provides, you can usually tell the compiler to pack the struct, but 1) performance may suffer when you use nonaligned elements in the struct, and 2) your code will be nonportable, since different compilers specify this in different ways. See "Visual C++ equivalent of GCC's __attribute__ ((__packed__))" for a quick summary of ways to do this on different compilers.
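As a rough illustration of the packing route: #pragma pack(push, 1) is accepted by MSVC, GCC and Clang, although it is still a compiler extension and not standard C++. The sketch below also assumes that the host and the file are both little-endian. PackedLeadIn and read_packed_lead_in are names made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

#pragma pack(push, 1)   // no padding between members from here on
struct PackedLeadIn {
    char          Signature[4];
    std::int32_t  Toc;
    std::uint32_t vernum;
    std::uint64_t nextSegmentOff;   // now at offset 12, as in the file
    std::uint64_t rawDataOff;
};
#pragma pack(pop)       // restore the previous packing

// Fail the build if the struct ever stops matching the 28-byte on-disk layout.
static_assert(sizeof(PackedLeadIn) == 28, "lead in must be 28 bytes");

// Read the whole lead in with a single fread; returns false on a short read.
inline bool read_packed_lead_in(std::FILE* f, PackedLeadIn& out) {
    return std::fread(&out, sizeof out, 1, f) == 1;
}
```

The static_assert is the useful part: if another compiler ignores the pragma, the build breaks instead of silently reading shifted values.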

The portable but somewhat more prosaic way is:

fread(&lead.Signature, 4, 1, f);
fread(&lead.Toc, sizeof(__int32), 1, f);
...