Understanding floating point representation errors; what's wrong with my thinking?

2023-03-11 22:17 Asked by:

I'm having some trouble understanding why some numbers can't be represented as floating-point numbers.

As we know, a normal float has a sign bit, an exponent, and a mantissa. Why can't 0.1, for example, be represented accurately in this system? The way I think of it, you would put 10 (1010 in binary) in the mantissa and -2 in the exponent. As far as I know, both numbers can be represented accurately in the mantissa and exponent. So why can't we represent 0.1 accurately?

If your exponent is decimal (i.e. it represents 10^X), you can precisely represent 0.1 -- however, most floating point formats use binary exponents (i.e. they represent 2^X). Since there are no integers X and Y such that Y * (2 ^ X) = 0.1, you cannot precisely represent 0.1 in most floating point formats.
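A quick way to see this is to ask Python for the exact value it stores for 0.1. This is only a sketch and assumes that Python's float is an IEEE-754 double, which holds on all common platforms; the variable names are mine.

```python
from decimal import Decimal
from fractions import Fraction

# Exact value of the double nearest to 0.1, as a ratio of two integers.
numerator, denominator = (0.1).as_integer_ratio()

# The denominator is a power of two, so the stored value cannot equal 1/10.
denominator_is_power_of_two = denominator & (denominator - 1) == 0
stored_equals_one_tenth = Fraction(numerator, denominator) == Fraction(1, 10)

print(numerator, denominator)
print(Decimal(0.1))  # exact decimal expansion of the stored double
print(denominator_is_power_of_two, stored_equals_one_tenth)
```

The stored value is Y * 2^X with X = -55, and it is slightly larger than 0.1.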

Some languages have types with both kinds of exponent. In C#, for example, there is a data type aptly named decimal, which is a floating point format with a decimal exponent, so it can store a number like 0.1 exactly. It has other uncommon properties, though: the decimal type can distinguish between 0.1 and 0.10, and it is always true that x + 1 != x for all values of x.

For most common purposes, though, C# also has the float and double floating point types that cannot precisely store 0.1 because they use a binary exponent (as defined in IEEE-754). The binary floating point types use less storage, are faster because they are easier to implement, and have more operations defined on them. In general decimal is only used for financial values where the exact representation of all decimal values is important and the storage, speed, and range of operations are not.
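The same contrast can be sketched in Python, whose standard decimal module also uses a decimal exponent. Note that this is an analogy only: Python's Decimal is not the C# decimal type, and its precision and range differ.

```python
from decimal import Decimal

tenth = Decimal("0.1")          # digit 1 with decimal exponent -1
tenth_padded = Decimal("0.10")  # digits 1, 0 with decimal exponent -2

# Three tenths add up exactly in decimal, but not in binary floating point.
decimal_sum_exact = tenth + tenth + tenth == Decimal("0.3")
binary_sum_exact = 0.1 + 0.1 + 0.1 == 0.3

print(tenth.as_tuple())
print(tenth_padded.as_tuple())
print(tenth == tenth_padded, str(tenth), str(tenth_padded))
print(decimal_sum_exact, binary_sum_exact)
```

The two decimal values compare equal but keep different digit counts, which is the 0.1 versus 0.10 distinction described above.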

You should start by reading "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

Also check out:

Floating-Point Number Tutorial
Tutorial: Floating-Point Binary

Each floating-point number in the IEEE 754 standard is, in effect, some integer multiplied by some integer power of two. E.g., 3 is represented by 3 * 2⁰, 96 is represented by 3 * 2⁵, and 3/16 is represented by 3 * 2^-4.
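These decompositions can be checked with a small helper. The function name below is hypothetical (it is not a library call), and the sketch assumes finite float inputs.

```python
import math

def as_odd_integer_times_power_of_two(value):
    """Return (m, e) with value == m * 2**e and m odd (or m == 0)."""
    numerator, denominator = value.as_integer_ratio()
    # For a finite float the denominator is always a power of two.
    exponent = -(denominator.bit_length() - 1)
    # Move factors of two out of the integer part and into the exponent.
    while numerator != 0 and numerator % 2 == 0:
        numerator //= 2
        exponent += 1
    return numerator, exponent

for value in (3.0, 96.0, 3 / 16):
    m, e = as_odd_integer_times_power_of_two(value)
    # math.ldexp(m, e) computes m * 2**e, so this round-trips exactly.
    print(value, "=", m, "* 2 **", e, math.ldexp(m, e) == value)
```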

There are no integers x and y such that .1 = x * 2^y, therefore .1 cannot be exactly represented by a floating-point number. Proof: If .1 = x * 2^y, then 10x = 2^-y. 2^-y is clearly positive, so x is positive. It is also an integer, so 10x is divisible by 10, so it is divisible by 5. Therefore 2^-y is a power of two that is divisible by 5, which is clearly impossible.
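The key step of the proof, that no power of two is divisible by 5, can be spot-checked numerically. A finite check like this is of course not a proof; it only illustrates the argument, and the range of 1100 exponents is an arbitrary choice of mine.

```python
from fractions import Fraction

# If 0.1 == x * 2**y for integers x and y, then 10 * x == 2**(-y),
# so some power of two would have to be a multiple of 5.
# The remainders of 2**k modulo 5 cycle through 1, 2, 4, 3 and never hit 0.
remainders_mod_5 = {pow(2, k, 5) for k in range(1100)}

# Cross-check on the actual double nearest to 0.1: its exact value,
# in lowest terms, has a denominator with no factor of 5.
stored = Fraction(0.1)

print(sorted(remainders_mod_5))
print(stored.denominator % 5, Fraction(1, 10).denominator % 5)
```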

That would be 10 × 2^-1 = 5, not 0.1.
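This arithmetic is easy to check with math.ldexp(m, e), which computes m * 2**e. The sketch below keeps the mantissa from the question (10, i.e. 1010 in binary) and tries a range of negative binary exponents; the range itself is my choice.

```python
import math

mantissa = int("1010", 2)  # 10 in decimal

# mantissa * 2**exponent for exponents -1 down to -8
values = {exponent: math.ldexp(mantissa, exponent)
          for exponent in range(-1, -9, -1)}

for exponent, value in values.items():
    print(exponent, value)

# 0.1 falls strictly between two neighbouring results, so this mantissa
# never lands on it with a binary exponent.
skipped = values[-7] < 0.1 < values[-6]
print(skipped)
```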

Generally, it's like representing one-third in base ten: it's just not possible with a finite number of digits.
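The analogy can be made concrete by long division in each base. The helper name is hypothetical, and it only supports bases up to 10 because it prints each digit as a single character.

```python
def fraction_digits(numerator, denominator, base, count):
    """First `count` digits after the radix point of numerator/denominator."""
    digits = []
    remainder = numerator % denominator
    for _ in range(count):
        remainder *= base
        digits.append(remainder // denominator)
        remainder %= denominator
    return "".join(str(d) for d in digits)

third_in_decimal = fraction_digits(1, 3, 10, 12)  # 1/3 in base ten
tenth_in_binary = fraction_digits(1, 10, 2, 20)   # 1/10 in base two
tenth_in_decimal = fraction_digits(1, 10, 10, 5)  # 1/10 in base ten

print("0." + third_in_decimal)
print("0." + tenth_in_binary)
print("0." + tenth_in_decimal)
```

In base two, 1/10 settles into the repeating block 0011 just as 1/3 repeats the digit 3 in base ten, so any finite number of bits has to cut it off.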

By the way, 10₁₀ = 1010₂ ≠ 1100₂.

You're thinking about 1 * 10^-1, which works for a decimal floating point representation, such as decimal in C#. The normal floating point types (such as float and double) use a binary representation, i.e. powers of 2.

Normally, binary is used because it can be laid out more efficiently in bits. Decimal is normally used when exact decimal precision is required, for example when counting money.
