Is the use of machine epsilon appropriate for floating-point equality tests?
This is a follow-up to Testing for floating-point value equality: Is there a standard name for the “precision” constant?.
There is a very similar question Double.Epsilon
for equality, greater than, less than, less than or equal to, greater than or equal to.
It is well known that an equality test for two floating-point values x and y should look more like this (rather than a straightforward =):
abs( x - y ) < epsilon , where epsilon is some very small value.
How to choose a value for epsilon?
It would obviously be preferable to choose for epsilon as small a value as possible, to get the highest-possible precision for the equality check.
As an example, the .NET framework offers a constant System.Double.Epsilon
(= 4.94066 × 10-324), which represents the smallest positive System.Double
value that is greater than zero.
However, it turns out that this particular value can't be reliably used as epsilon, since:
0 +
System.Double.Epsilon
≠ 01 +
System.Do开发者_运维技巧uble.Epsilon
= 1 (!)
which is, if I understand correctly, because that constant is less than machine epsilon.
→ Is this correct?
→ Does this also mean that I can reliably use epsilon := machine epsilon for equality tests?
Removed these two questions, as they are already adequately answered by the second SO question linked-to above.
The linked-to Wikipedia article says that for 64-bit floating-point numbers (ie. the double
type in many languages), machine epsilon is equal to:
2-53, or approx. 0.000000000000000111 (a number with 15 zeroes after the decimal point)
→ Does it follow from this that all 64-bit floating point values are guaranteed to be accurate to 14 (if not 15) digits?
How to choose a value for epsilon?
Short Answer: You take a small value which fits your applications needs.
Long Answer: Nobody can know which calculations your application does and how accurate you expect your results to be. Since rounding errors sum up machine epsilon will be almost all times far too big so you have to chose your own value. Depending on your needs, 0.01 be be sufficient, or maybe 0.00000000000001 or less will.
The question is, do you really want/need to do equality tests on floating point values? Maybe you should redesign your algorithms.
In the past when I have had to use an epsilon value it's been very much bigger than the machine epsilon value.
Although it was for 32 bit doubles (rather than 64 bit doubles) we found that an epsilon value of 10-6 was needed for most (if not all) calculated values in our particular application.
The value of epsilon you choose depends on the scale of your numbers. If you are dealing with the very large (10+10 say) then you might need a larger value of epsilon as your significant digits don't stretch very far into the fractional part (if at all). If you are dealing with the very small (10-10 say) then obviously you need an epsilon value that's smaller than this.
You need to do some experimentation, performing your calculations and checking the differences between your output values. Only when you know the range of your potential answers will you be able to decide on a suitable value for your application.
The sad truth is: There is no appropriate epsilon for floating-point comparisons. Use another approach for floating-point equality tests if you don't want to run into serious bugs.
Approximate floating-point comparison is an amazingly tricky field, and the abs(x - y) < eps
approach works only for a very limited range of values, mainly because of the absolute difference not taking into account the magnitude of the compared values, but also due to the significant digit cancellation occurring in the subtraction of two floating-point values with different exponents.
There are better approaches, using relative differences or ULPs, but they have their own shortcomings and pitfalls. Read Bruce Dawson's excellent article Comparing Floating Point Numbers, 2012 Edition for a great introduction into how tricky floating-point comparisons really are -- a must-read for anyone doing floating-point programming IMHO! I'm sure countless thousands of man-years have been spent finding out the subtle bugs due to naive floating-point comparisons.
I also have questions regarding what would be the correct procedure. However I believe one should do:
abs(x - y) <= 0.5 * eps * max(abs(x), abs(y))
instead of:
abs(x - y) < eps
The reason for this arises from the definition of the machine epsilon. Using python code:
import numpy as np
real = np.float64
eps = np.finfo(real).eps
## Let's get the machine epsilon
x, dx = real(1), real(1)
while x+dx != x: dx/= real(2) ;
print "eps = %e dx = %e eps*x/2 = %e" % (eps, dx, eps*x/real(2))
Which gives: eps = 2.220446e-16 dx = 1.110223e-16 eps*x/2 = 1.110223e-16
## Now for x=16
x, dx = real(16), real(1)
while x+dx != x: dx/= real(2) ;
print "eps = %e dx = %e eps*x/2 = %e" % (eps, dx, eps*x/real(2))
Which now gives: eps = 2.220446e-16 dx = 1.776357e-15 eps*x/2 = 1.776357e-15
## For x not equal to 2**n
x, dx = real(36), real(1)
while x+dx != x: dx/= real(2) ;
print "eps = %e dx = %e eps*x/2 = %e" % (eps, dx, eps*x/real(2))
Which returns: eps = 2.220446e-16 dx = 3.552714e-15 eps*x/2 = 3.996803e-15
However, despite the difference between dx and eps*x/2, we see that dx <= eps*x/2
,
thus it serves the purpose for equality tests, checking for tolerances when testing for convergence in numerical procedures, etc.
Such is similar to what is in: www.ibiblio.org/pub/languages/fortran/ch1-8.html#02, however if someone knows of better procedures or if something here is incorrect, please do say.
精彩评论