IEEE 754: How exactly does it work?

2023-03-27 02:09

Why does the following code behave as it does in C?

float x = 2147483647; // 2^31 - 1
printf("%f\n", x); // Outputs 2147483648.000000

Here is my thought process:

2147483647 =   0      1001 1101      1111 1111 1111 1111 1111 111

   (0.11111111111111111111111)base2 = (1-(0.5)^23)base10
=> (1.11111111111111111111111)base2 = (1 + 1-(0.5)^23)base10 = (1.99999988)base10

Therefore, to convert the IEEE 754 notation back to decimal: 1.99999988 * 2^30 = 2147483520

So technically, the C program must have printed out 2147483520, right?

The value to be represented is 2147483647. The two nearest values that can be represented in this format are 2147483520 and 2147483648.

As the latter is closer to the unrepresentable "ideal one", it gets used: in floating point, the values get rounded, not truncated.

The standard is available here. You might have to purchase it, as IEEE (and other organizations like it) mainly make their money by selling the standard, to defray their costs in assembling, lobbying for acceptance, and improving the quality of the standard.

The bits only mean what someone designates them to be

"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less." "The question is," said Alice, "whether you can make words mean so many different things." "The question is," said Humpty Dumpty, "which is to be master - - that's all." (Through the Looking Glass, Chapter 6)

In this case IEEE has decided what the bits mean, and the printf conversion specifier %f prints the right corresponding human representation because it follows the same standard.

Occasionally you can manage to reinterpret the bits as another data type (like an int) and print out the "other" representation of those bits. C will catch a lot of the normal number promotions, but you can confuse it, generally by assigning the address to a pointer of the wrong type and dereferencing it.

Note that while you are doing the math by hand, the actual hardware isn't guaranteed to do the math exactly as you would. With integer math there is much more accuracy in the representation, but with floating point math, how you round a number makes a big difference in the output. That's not even mentioning the floating point errors which sometimes were burned into systems (thankfully not often).

Floating point formats are often in a "normalized form" where the most significant bit of the mantissa is always 1. Since it's always 1, you don't need to use up a bit to store it. So when decoding such a number representation, you'll need to add back the 1 at the top.

2147483647 = 2^31 - 1 = +1 * 2^30 * 1.1111 1111 1111 1111 1111 1111 1111 11

When encoding this number in the IEEE 754-1985 single precision format, the significand is rounded properly. For the rounding mode round to nearest even (the default rounding mode) this means it gets rounded up.

Before rounding:

exponent = 30, significand = 1.1111 1111 1111 1111 1111 1111 1111 11

After rounding the significand to 23 digits after the binary point:

exponent = 30, significand = 10.0000 0000 0000 0000 0000 000

After normalizing:

exponent = 31, significand = 1.0

Encoded in the single precision format:

0 | 10011110 | 00000000000000000000000
