Float subtraction returns incorrect value

2023-02-15 21:00 Q&A author:

So I have a calculation whereby two floats that are components of vector objects are subtracted and then seem to return an incorrect result.

The code I'm attempting to use is:

cout << xresult.x << " " << vec1.x << endl;
float xpart1 = xresult.x - vec1.x;
cout << xpart1 << endl;

Where running this code will return

16 17
-1.00002

As you can see, printing out the values of xresult.x and vec1.x tells you that they are 16 and 17 respectively, yet the subtraction operation seems to introduce an error.

Any ideas why?

As you can see, printing out the values of xresult.x and vec1.x tells you that they are 16 and 17 respectively, yet the subtraction operation seems to introduce an error.

No, it doesn't tell us that at all. It tells us that the input values are approximately 16 and 17. The imprecision might, generally, come from two sources: the nature of floating-point representation, and the precision with which the numbers are printed.

Output streams print floating-point values to a certain level of precision. From a description of the std::setprecision function:

On the default floating-point notation, the precision field specifies the maximum number of meaningful digits to display in total counting both those before and those after the decimal point.

So, the values of xresult.x and vec1.x are 16 and 17 only to the stream's default precision of 6 significant digits. In fact, one is slightly less than 16 and the other slightly more than 17. (Note that this has nothing to do with imprecise floating-point representation. The declarations float f = 16 and float g = 17 both assign exact values. A float can hold the exact integers 16 and 17, although there are infinitely many other integers a float cannot hold.) When we subtract slightly-more-than-17 from slightly-less-than-16, we get an answer slightly more negative than -1.

To prove to yourself that this is the case, do one or both of these experiments. First, in your own code, add "cout << std::setprecision(10);" (declared in <iomanip>) before printing those values. Second, run this test program:

#include <iostream>
#include <iomanip>

int main() {
  // Print the same two floats and their difference at precisions 0 through 9.
  for (int i = 0; i < 10; i++) {
    std::cout << std::setprecision(i) <<
      15.99999f << " - " << 17.00001f << " = " <<
      15.99999f - 17.00001f << "\n";
  }
}

Notice how the 7th line of output matches your case:

16 - 17 = -1.00002

P.S. All of the other advice about imprecise floating-point representation is valid; it just doesn't apply to your particular circumstance. You really should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

This is called floating-point arithmetic. It is why numerical code is so "tricky" and filled with pitfalls. That result is expected. What is more, whether you see it, and to what extent, can depend on the processor you're working with.

I'd like to add that the floating-point types float, double, and long double each have a different precision. That is, some of them can represent a given floating-point value more accurately than others. That follows from how these numbers are held in memory.

When you look at a float, it contains fewer significant digits than, say, a double or long double. Hence, when you perform numerics on them, you must expect that floats will suffer from larger rounding errors. When dealing with financial data, developers often use some form of "decimal" type. These are much better designed to handle currency-type manipulations with better accuracy of the significant digits. That comes at a price, however.

Take a look at the IEEE 754-2008 specification.

It's because of how floating-point numbers work. http://en.wikipedia.org/wiki/Floating_point

Because you can't accurately represent all numbers using a float. Wikipedia has a good description of it: http://en.wikipedia.org/wiki/Floating_point

How much do you know about the way numbers are stored in a computer?

Also, what types are xresult.x and vec1.x: ints, floats, or something else?

I'd be surprised if the error occurred with all of them being floats, but you are converting between types, and binary is not the same as decimal.

If there was a small decimal portion on the 16 and 17 that wasn't printed out, when the values are normalized to the same base for subtraction, that could introduce extra error, especially for 32 bit types like float.

When you use floating point values, you need to be prepared within your application to deal with the fact that you won't get 100% accurate decimal results. Your results will be as accurate as possible in the internal binary representation. Addition and subtraction especially can introduce a significant amount of relative error for operands that are orders of magnitude apart and for results that should be close to 0.

People keep talking about how computer representations cannot perfectly represent real numbers, and how computer operations on floating point numbers cannot be perfectly precise.

This is true, but the same is true of the real world.

Real measurements are approximations to some degree of precision. Operations on real measurements result in approximations to some degree of precision.

If I count 17 bowling balls, I have 17 bowling balls. If I remove 16 bowling balls, I have one bowling ball.

But if I have a stick that is 17 inches long, what I really have is a stick that is about 17 inches long. If I cut off 16 inches, what I'm really cutting off is about 16 inches, and what I'm left with is about 1 inch.

You have to keep track of the accuracy of your measurements, and the precision of your results. If I have 17.0, accurate to three significant digits, and subtract 16.0, also accurate to three significant digits, the result is 1.0, accurate to two significant digits. And that's what you got. Your mistake was in assuming that the extra precision provided by your results, beyond the accuracy you were given, was meaningful. It's not. It's meaningless noise.

This isn't something specific to computer floating-point numbers; you have the same issue whether using a calculator or working out the problems by hand.

Keep track of your significant digits, and format your answers to suppress precision beyond what is significant.

Make your variables doubles instead of floats. You'll get more precision.

EDIT Computers store numbers using a sequence of bits. The more bits you store, the higher the precision of the result. In fact, floats usually have half as many bits as doubles, so they have lower precision.
