Casting UINT64 to float?
Is it safe to cast a UINT64 to a float? I realize that UINT64 does not hold decimals, so my float will be whole numbers. However, my function to return my delta-time returns a UINT64, which isn't a very useful type for the function I'm currently working with. I'm assuming a simple static_cast<float>(uint64value) will not work?
Large UINT64 values (an 8-byte type) may lose precision when you cast them to a float, which is only 4 bytes.
Define "safe": you can easily lose a lot of digits of precision if the 64-bit value is large, but apart from that (which is presumably a known issue you don't mind), the conversion is safe and well-defined. If your compiler doesn't handle it correctly, get a better compiler.
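For what it's worth, here's a minimal sketch (assuming <cstdint> for uint64_t) showing that the cast compiles and is well-defined, just lossy for large values:

#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t big = 10000000000000000001ULL; // 20 digits, ends in ...01
    float f = static_cast<float>(big);           // well-defined conversion, but it rounds
    std::cout.precision(20);
    std::cout << f << '\n';                      // prints a nearby value; the low digits are gone
}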
You might try performing your arithmetic in a long double (or double) first:
typedef long double real_type;

// long1 and long2 are the two UINT64 values (e.g., raw timer readings)
real_type x = static_cast<real_type>(long1);
real_type y = static_cast<real_type>(long2);
real_type z = x / y;
float result = static_cast<float>(z); // cast the value z, not the type
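Applied to the question's delta-time case, that might look like the following sketch (the function and parameter names here are hypothetical, not from any particular timer API):

#include <cstdint>

// Hypothetical helper: ticks and frequency both arrive as UINT64 from a timer.
float delta_seconds(std::uint64_t ticks, std::uint64_t ticks_per_second) {
    // Divide in long double so the 64-bit inputs keep their precision,
    // then narrow to float only once, at the end.
    long double seconds = static_cast<long double>(ticks) /
                          static_cast<long double>(ticks_per_second);
    return static_cast<float>(seconds);
}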
Rule of thumb: int can be cast to and back from double
It is safe to cast to and back from float, but you will be limited to rather small numbers, about 16 million (2^24), and if you exceed that magnitude you will silently lose the low-order bits. With double, you can use much larger integers.
Assuming an IEEE 754 underlying floating-point system, you will be able to accurately cast integers of 23 bits to and from float and 52 bits to and from double. Actually, you get one more bit because of the hidden bit, so you can fit an integer up to and including 0x1FFFFFFFFFFFFF, or 9007199254740991, in a double.
So every single 32-bit integer has an exact representation in double; it can be cast to and back safely, and the ordinary arithmetic operations on them will produce exact results.
Indeed, this is what JavaScript does for every integer numeric operation. People who say "floating point is inaccurate" are drastically oversimplifying the matter.
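A quick check of that 2^53 boundary (a sketch, assuming the usual IEEE 754 double):

#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t exact  = 9007199254740991ULL; // 2^53 - 1: representable in a double
    std::uint64_t beyond = 9007199254740993ULL; // 2^53 + 1: not representable

    // 1: round-trips exactly
    std::cout << (static_cast<std::uint64_t>(static_cast<double>(exact)) == exact) << '\n';
    // 0: rounds to 2^53 on the way through the double
    std::cout << (static_cast<std::uint64_t>(static_cast<double>(beyond)) == beyond) << '\n';
}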
Safe? What do you mean by safe? As far as precision is concerned, an IEEE 754 float has a 23(+1)-bit mantissa. By forcing a 64-bit value into a rounded 24-bit value, you inflict a massive loss of precision across a wide range of least-significant bits. Is that loss acceptable in your application? Frankly, if your original value really makes use of the 64-bit range, squeezing it into something as small as a float doesn't sound like a good idea to me.
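To see that loss concretely, here's a small sketch: two uint64 values a full billion apart collapse to the same float, because only ~24 significant bits survive:

#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t a = 0xFFFFFFFFFFFFFFFFULL; // UINT64_MAX
    std::uint64_t b = a - 1000000000ULL;     // one billion less
    // Near 2^64 the spacing between adjacent floats is 2^40 (~1.1e12),
    // so both values round to the same float.
    std::cout << (static_cast<float>(a) == static_cast<float>(b)) << '\n'; // 1
}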
Why wouldn't static_cast work?
Max uint64 is 2^64 ≈ 1.84467441 × 10^19.
The largest finite IEEE 754 single-precision (32-bit) float is about 3.4028235 × 10^38, so every uint64 value is within float's range: the cast can lose precision, but it can never overflow.
Should work... having problems?
http://en.wikipedia.org/wiki/Single-precision_floating-point_format
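A sanity check of those ranges (again assuming the usual IEEE 754 binary32 float):

#include <cstdint>
#include <iostream>
#include <limits>

int main() {
    // Every uint64_t fits within float's range, so the cast can never
    // overflow; it can only lose precision.
    std::cout << std::numeric_limits<float>::max() << '\n';  // ~3.40282e+38
    std::cout << static_cast<float>(UINT64_MAX) << '\n';     // ~1.84467e+19
}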