Packing Floats into a long long

I would like to pack 2 floats into a long long. What would be the correct way to do this?


It depends somewhat on how portable you want it to be, in two senses: which platforms it works on at all, and which platforms it gives the same answer on. Something like this should do:

#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

long long pack(float lo, float hi) {
    assert(sizeof(float) == 4);
    assert(sizeof(long long) == 8);
    assert(CHAR_BIT == 8);

    uint32_t target;
    memcpy(&target, &lo, sizeof(target));
    uint64_t result = target;            /* lo goes in the low 32 bits */
    memcpy(&target, &hi, sizeof(target));
    result |= (uint64_t)target << 32;    /* unsigned shift avoids signed overflow */
    return (long long)result;
}

The "other" way to get the bits of a float as an integer in one write+read is with a so-called "union cast", but I prefer memcpy. You can also access the float one unsigned char at a time and build up the long long with 8-bit shifts.
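For what it's worth, the byte-at-a-time variant could look roughly like the sketch below. The names pack_bytes and unpack_bytes are made up for illustration, and it returns uint64_t rather than long long so the shifts stay unsigned. It makes the same assumptions as pack() above (8-bit bytes, 4-byte float), and the numeric value of the result depends on the byte order of float on your platform.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: builds the result from the floats' bytes in memory order. */
uint64_t pack_bytes(float lo, float hi) {
    assert(CHAR_BIT == 8 && sizeof(float) == 4);
    const unsigned char *pl = (const unsigned char *)&lo;
    const unsigned char *ph = (const unsigned char *)&hi;
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        result |= (uint64_t)pl[i] << (8 * i);        /* bytes 0..3 come from lo */
        result |= (uint64_t)ph[i] << (8 * (i + 4));  /* bytes 4..7 come from hi */
    }
    return result;
}

/* Hypothetical inverse: writes the bytes back in the same order. */
void unpack_bytes(uint64_t packed, float *lo, float *hi) {
    unsigned char *pl = (unsigned char *)lo;
    unsigned char *ph = (unsigned char *)hi;
    for (int i = 0; i < 4; i++) {
        pl[i] = (unsigned char)(packed >> (8 * i));
        ph[i] = (unsigned char)(packed >> (8 * (i + 4)));
    }
}
```

Accessing an object through unsigned char pointers is always allowed, so this sidesteps the aliasing question entirely. Calling unpack_bytes(pack_bytes(x, y), &a, &b) should give back x and y bit for bit.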


You can't do this. A float is 32 bits wide, so only one float fits in a uint32_t.

To your edited question, depending on how [over]zealous your compiler is about strict aliasing optimizations, you may be able to do something like this. Be sure to test however, because compilers love to break this sort of thing. A safer implementation would probably use memcpy.

#include <stdint.h>

union converter { uint32_t i; float f; };

uint64_t pack(float a, float b) {
    union converter ca = { .f = a };
    union converter cb = { .f = b };
    return ((uint64_t)cb.i << 32) + ca.i;
}

void unpack(uint64_t packed, float *a, float *b) {
    union converter ca = { .i = (uint32_t)packed };          /* low 32 bits */
    union converter cb = { .i = (uint32_t)(packed >> 32) };  /* high 32 bits */
    *a = ca.f;
    *b = cb.f;
}

Note: use [u]int64_t, not long long; long long is only guaranteed to be at least 64 bits wide, not exactly 64.
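If you would rather not rely on the union, the memcpy alternative mentioned above might look something like this. It is only a sketch: pack_memcpy and unpack_memcpy are names I made up, and it assumes float is exactly 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical memcpy-based counterpart of pack(): a in the low half, b in the high half. */
uint64_t pack_memcpy(float a, float b) {
    uint32_t ia, ib;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&ia, &a, sizeof ia);   /* copy the bits of a, no aliasing involved */
    memcpy(&ib, &b, sizeof ib);
    return ((uint64_t)ib << 32) | ia;
}

/* Hypothetical memcpy-based counterpart of unpack(). */
void unpack_memcpy(uint64_t packed, float *a, float *b) {
    uint32_t ia = (uint32_t)packed;          /* low 32 bits */
    uint32_t ib = (uint32_t)(packed >> 32);  /* high 32 bits */
    memcpy(a, &ia, sizeof ia);
    memcpy(b, &ib, sizeof ib);
}
```

Compilers generally turn these small fixed-size memcpy calls into plain register moves, so there should be no cost compared with the union version.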


Assuming your floats are represented using 32 bits, in general you will not be able to pack two of them into a 32-bit integer without losing data.

If you absolutely have to do this, you may want to convert each of them to a 16-bit float (aka "half"): http://en.wikipedia.org/wiki/Half_precision_floating-point_format

The Wikipedia page includes a link (at the very bottom) with information on how to do the conversion in C.

Once you have the 'halves' you can then bit mask & shift them into your 32-bit integer.
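The mask-and-shift step might look like the sketch below. It assumes the two values have already been converted to half-precision bit patterns held in uint16_t (the float-to-half conversion itself is not shown), and the function names are just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* a and b are assumed to already hold 16-bit half-precision bit patterns. */
uint32_t pack_halves(uint16_t a, uint16_t b) {
    return ((uint32_t)b << 16) | a;   /* b in the high 16 bits, a in the low 16 */
}

void unpack_halves(uint32_t packed, uint16_t *a, uint16_t *b) {
    assert(a && b);
    *a = (uint16_t)(packed & 0xFFFFu);  /* mask off the low half */
    *b = (uint16_t)(packed >> 16);      /* shift down the high half */
}
```

For example, 0x3C00 and 0xC000 are the half-precision patterns for 1.0 and -2.0, and pack_halves(0x3C00, 0xC000) gives 0xC0003C00.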

Update: now that the question has been changed to use 'long long' instead of uint32_t:

The memcpy solution has already been mentioned, so here is an example using a union:

#include <stdio.h>
#include <stdint.h>

typedef union
{
    uint64_t llv;
    float fv[2];
} conv ;


int main(int argc, char **argv)
{
    conv a;
    conv b;

    a.fv[0] = 1.0;
    a.fv[1] = -2.0;

    printf("%f %f\n",a.fv[0],a.fv[1]);
    b.llv = a.llv;
    printf("%f %f\n",b.fv[0],b.fv[1]);
}


Try the following:

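    /* Read each float as 32 bits; reading it through a long long* would run past the end of the float. */
    uint32_t float1_bits, float2_bits;
    memcpy(&float1_bits, &float1, sizeof float1_bits);
    memcpy(&float2_bits, &float2, sizeof float2_bits);
    long long packed_floats = (long long)(((uint64_t)float2_bits << 32) | float1_bits);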


The standard float and double are typically 4 and 8 bytes, so you can't directly pack two of them into one 32-bit integer, which is only 4 bytes.

However, you could define a 16-bit format with a limited precision and exponent range. (I didn't check to see if there already was a standard. There probably is but this is more fun.)

f e d c|b a 9 8|7 6 5 4|3 2 1 0  
S E E E|E E m m|m m m m|m m m m

My new format, called Ross-float, has a sign bit, 5 bits of exponent, and 10 bits of precision.

This format can represent integers exactly from -1023 to +1023, and it can represent real numbers in about the range of 10^-4.8 to 10^4.8.
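To make the layout concrete, here is a rough sketch of splitting such a 16-bit word into its three fields and joining them back. The struct and function names are invented, and since the exponent bias and any implicit leading bit are not specified here, the sketch stops at the raw fields and does not convert to or from a real number.

```c
#include <assert.h>
#include <stdint.h>

/* Raw fields of the 16-bit layout: S EEEEE mmmmmmmmmm (hypothetical struct). */
struct ross_fields {
    unsigned sign;      /* bit 15 */
    unsigned exponent;  /* bits 14..10 */
    unsigned mantissa;  /* bits 9..0 */
};

struct ross_fields ross_split(uint16_t bits) {
    struct ross_fields f;
    f.sign = (bits >> 15) & 0x1u;
    f.exponent = (bits >> 10) & 0x1Fu;
    f.mantissa = bits & 0x3FFu;
    return f;
}

uint16_t ross_join(struct ross_fields f) {
    /* Each field must fit in its slot: 1, 5 and 10 bits. */
    assert(f.sign <= 1u && f.exponent <= 0x1Fu && f.mantissa <= 0x3FFu);
    return (uint16_t)((f.sign << 15) | (f.exponent << 10) | f.mantissa);
}
```

For instance, the word 0x8401 splits into sign 1, exponent 1 and mantissa 1, and ross_join puts it back together unchanged.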

Update:

Ok, now the question has changed to long long, so the whole problem becomes trivial. That's just as well, because 16 bits is not enough for a good floating point number.
