How to get the upper-/lower machine-word of a double according to IEEE 754 (ansi-c)?
i want to use the sqrt implementation of fdlibm.
This implementation defines (according to the endianess) some macros for accessing the lower/upper 32-bit of a double) in the following way (here: only the little-endian-version):#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
#define __HIp(x) *(1+(int*)x)
#define __LOp(x) *(int*)x
The readme of flibm is saying the following (a little bit shortened)
Each double precision floating-point number must be in IEEE 754
double format, and that each number can be retrieved as two 32-bit
integers through the using of 开发者_如何学Cpointer bashing as in the example
below:
Example: let y = 2.0
double fp number y: 2.0
IEEE double format: 0x4000000000000000
Referencing y as two integers:
*(int*)&y,*(1+(int*)&y) = {0x40000000,0x0} (on sparc)
{0x0,0x40000000} (on 386)
Note: Four macros are defined in fdlibm.h to handle this kind of
retrieving:
__HI(x) the high part of a double x
(sign,exponent,the first 21 significant bits)
__LO(x) the least 32 significant bits of x
__HIp(x) same as __HI except that the argument is a pointer
to a double
__LOp(x) same as __LO except that the argument is a pointer
to a double
If the behavior of pointer bashing is undefined, one may hack on the
macro in fdlibm.h.
I want to use this implementation and these macros with the cbmc model checker, which should be conformable with ansi-c.
I don't know exactly whats wrong, but the following example shows that these macros aren't working (little-endian was chosen, 32-bit machine-word was chosen):temp=24376533834232348.000000l (0100001101010101101001101001010100000100000000101101110010000111)
high=0 (00000000000000000000000000000000)
low=67296391 (00000100000000101101110010000111)
Both seem to be wrong. High seems to be empty for every value of temp.
Any new ideas for accessing the both 32-words with ansi-c?
UPDATE: Thanks for all your answers and comments. All of your proposals worked for me. For the moment i decided to use "R.."s version and marked this as favorite answer because it seems to be the most robust in my tool regarding endianness.
Why not use an union?
union {
double value;
struct {
int upper;
int lower;
} words;
} converter;
converter.value = 1.2345;
printf("%d",converter.words.upper);
(Note that the behaviour code is implementation-dependent and relies on internal representation and specific data sizes)
On top of that, if you make that struct contain bitfields, you can access the individual floating-point parts (sign, exponent and mantissa) separately:
union {
double value;
struct {
int upper;
int lower;
} words;
struct {
long long mantissa : 52; // not 2C!
int exponent : 11; // not 2C!
int sign : 1;
};
} converter;
Casting pointers like you're doing violates the aliasing rules of the C language (pointers of different types may be assumed by the compiler not to point to the same data, except in certain very restricted cases). A better approach might be:
#define REP(x) ((union { double v; uint64_t r; }){ x }).r
#define HI(x) (uint32_t)(REP(x) >> 32)
#define LO(x) (uint32_t)(REP(x))
Note that this also fixed the endian dependency (assuming the floating point and integer endianness are the same) and the illegal _
-prefix on the macro names.
An even better way might be not breaking it into high/low portions at all, and using the uint64_t
representation REP(x)
directly.
From a standards perspective, this use of unions is a little bit suspect, but better than the pointer casts. Using a cast to unsigned char *
and accessing the data byte-by-byte would be better in some ways, but worse in that you have to worry about endian considerations, and probably a lot slower..
I would suggest taking a look at the disassembly to see exactly why the existing "pointer-bashing" method does not work. In its absence, you might use something more traditional like a binary shift (if you're on a 64-bit system).
精彩评论