Data serialization in C?

2023-02-23 14:04 Q&A author:

I have this structure which I want to write to a file:

typedef struct
{
    char* egg;
    unsigned long sausage;
    long bacon;
    double spam;
} order;

This file must be binary and must be readable by any machine that has a C99 compiler.

I looked at various approaches to this matter such as ASN.1, XDR, XML, Protocol Buffers and many others, but none of them fit my requirements:

small
simple
written in C

I decided then to make my own data protocol. I could handle the following representations of integer types:

unsigned
signed in one's complement
signed in two's complement
signed in sign and magnitude

in a valid, simple and clean way (impressive, no?). However, the real types are being a pain now.

How should I read float and double from a byte stream? The standard says that bitwise operators (at least &, |, << and >>) are for integer types only, which left me without hope. The only way I could think was:

int sign;
int exponent;
unsigned long mantissa;

order my_order;

sign = read_sign();
exponent = read_exponent();
mantissa = read_mantissa();

my_order.spam = sign * mantissa * pow(10, exponent);

but that doesn't seem really efficient. I also could not find a description of the representation of double and float. How should one proceed here?

If you want to be as portable as possible with floats you can use frexp and ldexp:

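#include <limits.h>
#include <math.h>

/* WriteInt and WriteUnsigned are assumed to exist elsewhere. */
void WriteFloat (float number)
{
  int exponent;
  unsigned long mantissa;

  /* frexp returns a fraction in [0.5, 1) for positive input.
     A negative input would need its sign stored separately,
     because a negative value cannot be converted to unsigned. */
  mantissa = (unsigned long) (INT_MAX * frexp(number, &exponent));

  WriteInt (exponent);
  WriteUnsigned (mantissa);
}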

float ReadFloat ()
{
  int exponent = ReadInt();
  unsigned long mantissa = ReadUnsigned();

  float value = (float)mantissa / INT_MAX;

  return ldexp (value, exponent);
}

The idea behind this is that ldexp, frexp and INT_MAX are all standard C. Also, an unsigned long usually has at least as many bits as the float mantissa (this is not guaranteed, but it is a reasonable assumption, and I don't know of a single architecture where it fails).

Therefore the conversion itself works without precision loss. The multiplication and division by INT_MAX may lose a little precision, but that's a compromise one can live with.
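The small loss from scaling by INT_MAX can be avoided by scaling with a power of two instead. The following is a minimal sketch of that variation for double, not the code from the answer above. It assumes a binary floating-point format with at most 53 mantissa bits and a finite input; the names packed_double, pack_double and unpack_double are hypothetical.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical wire form: sign flag, binary exponent, integer mantissa. */
struct packed_double {
    int            negative;  /* 1 if the sign bit of the value is set */
    int_least32_t  exponent;  /* binary exponent reported by frexp */
    uint_least64_t mantissa;  /* fraction scaled by 2^53 */
};

/* Split a finite double into three integers.
   Assumes the mantissa has at most 53 bits, so the scaling is exact. */
struct packed_double pack_double(double value)
{
    struct packed_double p;
    int exponent = 0;
    double fraction = frexp(fabs(value), &exponent);  /* 0, or in [0.5, 1) */

    p.negative = signbit(value) ? 1 : 0;
    p.exponent = (int_least32_t)exponent;
    p.mantissa = (uint_least64_t)ldexp(fraction, 53); /* fraction * 2^53 */
    return p;
}

/* Rebuild the double from the three integers. */
double unpack_double(struct packed_double p)
{
    double fraction = ldexp((double)p.mantissa, -53);
    double value = ldexp(fraction, (int)p.exponent);
    return p.negative ? -value : value;
}
```

The three fields can then be written with whatever integer encoding is already in place. Storing the sign separately keeps the mantissa unsigned.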

If you are using C99 you can output real numbers in portable hex using %a.
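As a rough sketch of the %a approach: snprintf writes the hexadecimal form and strtod reads it back. Note that the result is text rather than binary, so it only partly matches the question's requirements. The helper names double_to_hex and hex_to_double are hypothetical, and the exact digits printed may differ between C libraries.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format value as a C99 hexadecimal floating-point string.
   Returns 0 on success, -1 if the buffer is too small. */
int double_to_hex(double value, char *buffer, size_t size)
{
    int written = snprintf(buffer, size, "%a", value);
    return (written > 0 && (size_t)written < size) ? 0 : -1;
}

/* Parse the string back; C99 strtod accepts hexadecimal floating-point input. */
double hex_to_double(const char *text)
{
    return strtod(text, NULL);
}
```

On a binary floating-point machine, %a with default precision prints the value exactly, so a round trip should return the original number.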

If you are using IEEE-754, why not access the float or double as an unsigned integer of the same width (typically 32 bits for float and 64 bits for double) and save the floating-point data as a series of bytes, then convert that integer back to a float or double on the other side of the transmission? The bit pattern is preserved, so you should end up with the same floating-point number after transmission.
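A minimal sketch of this idea, assuming double is an 8-byte IEEE-754 binary64 value, that uint64_t exists, and that floats and integers share the same byte order on the host. The function names are hypothetical. memcpy is used to copy the bits because bitwise operators cannot be applied to a double directly.

```c
#include <stdint.h>
#include <string.h>

/* Store the bit pattern of a double as 8 bytes, most significant first. */
void double_to_bytes(double value, unsigned char out[8])
{
    uint64_t bits;
    int i;

    memcpy(&bits, &value, sizeof bits);   /* copy the bits, no conversion */
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)((bits >> (56 - 8 * i)) & 0xFF);
}

/* Rebuild the double from 8 bytes written by double_to_bytes. */
double bytes_to_double(const unsigned char in[8])
{
    uint64_t bits = 0;
    double value;
    int i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Extracting the bytes with shifts fixes the byte order in the file regardless of the host's endianness.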

This answer uses Nils Pipenbrinck's method, but I have changed a few details that I think help to ensure real C99 portability. The solution assumes that encode_int64, encode_int32 and the matching decode functions already exist.

#include <stdint.h>     
#include <math.h>                                                         

#define PORTABLE_INTLEAST64_MAX ((int_least64_t)9223372036854775807) /* 2^63-1*/             

/* NOTE: +-inf and nan not handled. quickest solution                            
 * is to encode 0 for !isfinite(val) */                                          
void encode_double(struct encoder *rec, double val) {                            
    int exp = 0;                                                                 
    double norm = frexp(val, &exp);                                              
    int_least64_t scale = norm*PORTABLE_INTLEAST64_MAX;                          
    encode_int64(rec, scale);                                                    
    encode_int32(rec, exp);                                                      
}                                                                                

void decode_double(struct encoder *rec, double *val) {                           
    int_least64_t scale = 0;                                                     
    int_least32_t exp = 0;                                                       
    decode_int64(rec, &scale);                                                   
    decode_int32(rec, &exp);                                                     
    *val = ldexp((double)scale/PORTABLE_INTLEAST64_MAX, exp);                    
}

This is still not a complete solution: inf and NaN cannot be encoded. Also notice that both parts of the double carry sign bits.

int_least64_t is guaranteed by the standard (int64_t is not), and we use the smallest permissible maximum for this type to scale the double. The encoding routines accept int_least64_t but will have to reject input that needs more than 64 bits, for portability; the same applies to the 32-bit case.

The C standard doesn't define a representation for floating-point types. Your best bet would be to convert them to IEEE-754 format and store them that way. The question "Portability of binary serialization of double/float type in C++" may help you there.
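One way to do that conversion without relying on the host's own representation is to build the binary64 bit pattern arithmetically. The sketch below is an illustration under assumptions, not a reference implementation: it assumes uint64_t is available and that the value lies within the binary64 range, and the names to_ieee754_bits and from_ieee754_bits are hypothetical.

```c
#include <math.h>
#include <stdint.h>

/* Build the IEEE-754 binary64 bit pattern of value using only arithmetic. */
uint64_t to_ieee754_bits(double value)
{
    uint64_t sign = signbit(value) ? (uint64_t)1 << 63 : 0;
    int exponent = 0;
    double fraction;

    if (isnan(value))
        return UINT64_C(0x7FF8000000000000);      /* one quiet NaN pattern */
    if (isinf(value))
        return sign | UINT64_C(0x7FF0000000000000);
    if (value == 0.0)
        return sign;

    fraction = frexp(fabs(value), &exponent);     /* in [0.5, 1) */
    if (exponent < -1021)
        /* subnormal: exponent field is 0, no implicit leading bit */
        return sign | (uint64_t)ldexp(fraction, exponent + 1074);

    /* normal: drop the implicit leading 1 and bias the exponent */
    return sign
         | ((uint64_t)(exponent + 1022) << 52)
         | (uint64_t)ldexp(2.0 * fraction - 1.0, 52);
}

/* Turn a binary64 bit pattern back into the host's double. */
double from_ieee754_bits(uint64_t bits)
{
    int negative = (int)(bits >> 63);
    int exponent = (int)((bits >> 52) & 0x7FF);
    uint64_t mantissa = bits & UINT64_C(0x000FFFFFFFFFFFFF);
    double value;

    if (exponent == 0x7FF)
        value = mantissa ? NAN : INFINITY;
    else if (exponent == 0)
        value = ldexp((double)mantissa, -1074);   /* zero or subnormal */
    else
        value = ldexp((double)(mantissa | ((uint64_t)1 << 52)),
                      exponent - 1075);
    return negative ? -value : value;
}
```

The resulting 64-bit integer can be written with the same byte-oriented integer routines used for the other fields. NaN payloads are not preserved in this sketch.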

Note that the C standard also doesn't specify a format for integers. While most computers you're likely to encounter will use a normal two's-complement representation with only endianness to be concerned about, it's also possible they would use a one's-complement or sign-magnitude representation, and both signed and unsigned ints may contain padding bits that don't contribute to the value.
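For the integer fields, one approach that sidesteps the host representation is to work on values rather than on bits in memory. The sketch below writes a long as 8 bytes of two's complement, most significant byte first. It relies on the rule that converting a signed value to an unsigned type is defined modulo 2^N. The names encode_signed and decode_signed are hypothetical, and decoding assumes the stored value fits in the reader's long.

```c
#include <stdint.h>

/* Write value as 8 bytes of two's complement, most significant byte first. */
void encode_signed(long value, unsigned char out[8])
{
    /* Signed-to-unsigned conversion is defined modulo 2^N, so the low
       64 bits hold the two's complement pattern on any host. */
    uint_least64_t bits = (uint_least64_t)value;
    int i;

    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)((bits >> (56 - 8 * i)) & 0xFF);
}

/* Read 8 bytes written by encode_signed.
   Assumes the stored value fits in the reader's long. */
long decode_signed(const unsigned char in[8])
{
    uint_least64_t bits = 0;
    int i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];

    if (bits & ((uint_least64_t)1 << 63)) {
        /* negative: the inverted low 64 bits equal (-value) - 1 */
        uint_least64_t rest = ~bits & UINT64_C(0xFFFFFFFFFFFFFFFF);
        return -(long)rest - 1;
    }
    return (long)bits;
}
```

Because only shifts and arithmetic on values are used, padding bits and endianness of the host never reach the file.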
