x86-64 long double precision

What is the actual precision of long double on Intel 64-bit platforms? Is it 80 bits padded to 128, or actual 128-bit?

If the former, besides using GMP, is there another option for achieving true 128-bit precision?


Precision on x86-64 is the same as on regular x86. Extended double is 80 bits (10 bytes), handled by the x87 instructions, and is stored with 6 padding bytes to fill 16 bytes. There is no 128-bit FP hardware.
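One way to check what a particular compiler does is to ask it directly. The sketch below is only illustrative, and the helper names long_double_mantissa_bits and long_double_storage_bits are made up for this example. With GCC or Clang on x86-64 Linux one would typically expect 64 mantissa bits in 128 bits of storage, but the values depend on the compiler and OS.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>

// Number of significand (mantissa) bits the compiler gives long double.
// Typically 64 for the x87 extended format, 53 when long double is
// simply the same format as double.
constexpr int long_double_mantissa_bits() {
    return std::numeric_limits<long double>::digits;
}

// Storage size of long double in bits, padding included.
inline std::size_t long_double_storage_bits() {
    std::size_t bits = sizeof(long double) * CHAR_BIT;
    // The storage can never be smaller than the significand it holds.
    assert(bits >= static_cast<std::size_t>(long_double_mantissa_bits()));
    return bits;
}
```

If the storage size is 128 bits while the mantissa is 64 bits, the type is the padded 80-bit format rather than a true 128-bit one.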

A software implementation of quad or extended quad precision might benefit from the x86-64 64x64 => 128 integer multiply instruction, though.
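As a rough sketch of that building block: GCC and Clang expose the full 64x64 => 128 product through the non-standard unsigned __int128 type (an assumption here; MSVC does not provide it). The U128 struct and the mul_64x64 function are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// High and low 64-bit halves of a 128-bit product.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Full 64x64 => 128 unsigned multiply.
// unsigned __int128 is a GCC/Clang extension, not standard C++.
inline U128 mul_64x64(std::uint64_t a, std::uint64_t b) {
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    U128 r;
    r.hi = static_cast<std::uint64_t>(p >> 64);  // upper 64 bits
    r.lo = static_cast<std::uint64_t>(p);        // lower 64 bits
    return r;
}
```

A software quad-precision multiply would use products like this on the pieces of the two mantissas and then round the result.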


I would recommend using MPFR. It is a more sophisticated multiple-precision floating point library that is built on top of GMP.


There is a good chance that it's 64 bits for both (depending on the compiler and OS), because the compiler emits scalar SSE2 instructions instead of x87 instructions.

x86 doesn't support precision higher than 80 bits. If you really need more than 64 bits for an FP algorithm, though, you should most likely check your numerics rather than solve the problem with brute force.
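As one illustration of what "checking your numerics" can mean, here is a minimal sketch of Kahan (compensated) summation. This technique is not mentioned above; it is picked only as a common example, and kahan_sum is a hypothetical name. It assumes the compiler does not reassociate floating-point operations (so no -ffast-math).

```cpp
#include <cassert>
#include <vector>

// Kahan (compensated) summation: carries the rounding error of each
// addition forward so that small terms are not silently dropped.
inline double kahan_sum(const std::vector<double>& xs) {
    double sum = 0.0;
    double c = 0.0;  // running compensation for lost low-order bits
    for (double x : xs) {
        double y = x - c;    // apply the correction to the next term
        double t = sum + y;  // low-order bits of y may be lost here
        c = (t - sum) - y;   // recover what was lost (negated)
        sum = t;
    }
    return sum;
}
```

Adding many terms of 1e-16 to 1.0 with a plain loop leaves the sum at exactly 1.0 in double precision, whereas the compensated version accumulates them, without needing a wider type.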


I recommend the Boost wrappers over MPFR or GMP:

Boost 1.70: cpp_bin_float.

As well as arbitrary types to any desired precision, the following types are provided:

cpp_bin_float_single           (24-bit mantissa, 32 bits total)
cpp_bin_float_double           (53-bit mantissa, 64 bits total)
cpp_bin_float_double_extended  (64-bit mantissa, 80 bits total)
cpp_bin_float_quad             (113-bit mantissa, 128 bits total)
cpp_bin_float_oct              (237-bit mantissa, 256 bits total)

Boost works almost out of the box. Once it is compiled, all one needs to do is point the Visual Studio project at the Boost include and library directories.

Tested with Visual Studio 2017 + Boost v1.70.

See the instructions for compiling Boost.
