Why do two floating point multiplies give a different answer than one?
I recently ran into an issue where I wasn't getting the numerical result I expected. I tracked it down to the problem that is illustrated by the following example:
#include <stdio.h>
int main()
{
    double sample = .5;
    int a = (int)(sample * (1 << 31));
    int b = (int)(sample * (1 << 23) * (1 << 8));
    printf("a = %#08x, b = %#08x\n", a, b);
}
// Output is: a = 0xc0000000, b = 0x40000000
Why is the result of multiplying by (1 << 31) different than the result of multiplying by (1 << 23) * (1 << 8)? I expected the two to give the same answer but they don't.
I should note that all my floating point values are in the range [-1, 1).
You expect identical results because you assume that multiplying by (1 << 31) is the same as multiplying by (1 << 23) and then by (1 << 8). In general it is not. The expression (1 << 31) is evaluated in signed int arithmetic. If your platform uses 32-bit ints, (1 << 31) overflows, whereas (1 << 23) and (1 << 8) do not. Signed overflow is undefined behavior in C, so the result of the first multiplication is unpredictable. In your run the shift evidently produced a negative value (-2147483648), so half of it came out as -1073741824, which prints as 0xc0000000.
In other words, (1 << 31) is meaningless on a platform whose int type has only 31 bits in its value representation (the 32nd bit is the sign bit). You need at least 32 value bits to compute (1 << 31) meaningfully.
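To make the width argument concrete, here is a minimal sketch that counts the value bits of int from INT_MAX and checks whether a given shift of 1 is representable. The helper names int_value_bits and shift_fits_in_int are made up for illustration; they are not part of any library.

```c
#include <limits.h>

/* Hypothetical helper: number of value bits in int (sign bit excluded),
   found by counting the bits of INT_MAX. With a 32-bit int this is 31. */
static int int_value_bits(void)
{
    int bits = 0;
    for (int m = INT_MAX; m != 0; m >>= 1)
        bits++;
    return bits;
}

/* Hypothetical helper: nonzero if (1 << n) is representable in a signed int,
   i.e. if the shift can be evaluated without overflow. */
static int shift_fits_in_int(int n)
{
    return n >= 0 && n < int_value_bits();
}
```

With a 32-bit int, shift_fits_in_int(23) and shift_fits_in_int(8) are true, while shift_fits_in_int(31) is false, which is exactly the difference between the two expressions in the question.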
If you want (1 << 31) to make sense, calculate it in the unsigned domain: (1u << 31), (1u << 23) and (1u << 8). That should give you consistent results. Alternatively, you can use a larger signed integer type.
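A sketch of both suggestions, assuming a 32-bit int, a long long of at least 64 bits, and samples in [-1, 1) as stated in the question. The function names are made up for illustration.

```c
/* Scale with an unsigned shift: 1u << 31 is evaluated in unsigned
   arithmetic, so nothing overflows when unsigned int has 32 bits. */
static int scale_unsigned(double sample)
{
    return (int)(sample * (1u << 31));
}

/* Scale with a larger signed type: 1LL << 31 fits easily in long long. */
static int scale_wide(double sample)
{
    return (int)(sample * (1LL << 31));
}

/* The two-step version from the question, which was already well defined
   because neither shift overflows. */
static int scale_split(double sample)
{
    return (int)(sample * (1 << 23) * (1 << 8));
}
```

For sample = .5 all three return 0x40000000. Because the samples lie in [-1, 1), the scaled value always lies in [-2^31, 2^31), so the final conversion to a 32-bit int stays in range.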