
Floating point C++ compiler options | preventing a/b -> a* (1/b)

I'm writing realtime numeric software in C++, currently compiling it with Visual C++ 2008. I'm using the 'fast' floating-point model (/fp:fast), which enables various optimizations. Most of them are useful in my case, but one specifically:

a/b -> a*(1/b) Division by multiplicative inverse

is too numerically unstable for a lot of my calculations.

(see: Microsoft Visual C++ Floating-Point Optimization)

Switching to /fp:precise makes my application run more than twice as slow. Is it possible to either fine-tune the optimizer (i.e. disable this specific optimization), or somehow manually bypass it?

- Actual minimal-code example: -

void test(float a, float b, float c,
    float &ret0, float &ret1) {
  ret0 = b/a;
  ret1 = c/a;
} 

[my actual code is mostly matrix related algorithms]

The output from VC (cl, version 15, x86) is:

divss       xmm0,xmm1 
mulss       xmm2,xmm0 
mulss       xmm1,xmm0 

Having one div instead of two is a big problem numerically (xmm0 is preloaded with 1.0f from RAM): depending on the values in xmm1 and xmm2 (which may be in different ranges), you might lose a lot of precision. Compiling without SSE outputs similar stack-based x87 FPU code.
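To get a feel for how often the two forms disagree, here is a minimal sketch that compares a/b with a*(1/b) over a small grid of inputs and reports the largest difference in ULPs. The names compare_division, ulp_distance and DivStats are made up for illustration, and the code assumes IEEE-754 floats and a 32-bit int. It has to be built with /fp:precise (or otherwise without fast-math rewriting), because under /fp:fast the compiler may turn the reference division into the multiplication being measured.

```cpp
#include <cassert>
#include <cstring>

// Distance between two positive finite floats in units in the last place.
// Assumption: float is IEEE-754 binary32 and int is 32 bits wide.
inline int ulp_distance(float x, float y) {
    int ix = 0, iy = 0;
    std::memcpy(&ix, &x, sizeof(float));  // reinterpret the bit patterns
    std::memcpy(&iy, &y, sizeof(float));
    return ix > iy ? ix - iy : iy - ix;
}

struct DivStats {
    int mismatches;  // pairs where a/b != a*(1/b)
    int max_ulps;    // largest observed difference, in ULPs
};

// Compares a/b against a*(1/b) for all integer-valued a, b in [1, n].
inline DivStats compare_division(int n) {
    assert(n > 0);
    DivStats stats = {0, 0};
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= n; ++j) {
            // volatile keeps the compiler from folding the arithmetic away
            volatile float a = static_cast<float>(i);
            volatile float b = static_cast<float>(j);
            float quotient = a / b;      // one correctly rounded division
            float inverse = 1.0f / b;    // first rounding
            float product = a * inverse; // second rounding
            if (quotient != product) {
                ++stats.mismatches;
                int d = ulp_distance(quotient, product);
                if (d > stats.max_ulps) stats.max_ulps = d;
            }
        }
    }
    return stats;
}
```

Calling compare_division(100) and printing both fields shows how many of the 10000 pairs differ on a given build. Whether that size of error matters depends on how the results are used downstream, for example in the matrix algorithms mentioned above.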

Wrapping the function with

#pragma float_control( precise, on, push )
...
#pragma float_control(pop)

This does solve the accuracy problem, but first, it is only available at function level (global scope), and second, it prevents inlining of the function (i.e. the speed penalty is too high).

The 'precise' output is also being cast to 'double' and back:

 divsd       xmm1,xmm2 
 cvtsd2ss    xmm1,xmm1 
 divsd       xmm1,xmm0 
 cvtpd2ps    xmm0,xmm1 


Add the

#pragma float_control( precise, on)

before the computation and

#pragma float_control( precise,off)

after that. I think that should do it.
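As a rough sketch, this is how the pragma could be wrapped around the test function from the question. It uses the push/pop form shown in the question, which restores whatever setting was active before, rather than an explicit on/off pair. The function name divide_precise is made up for illustration, and the _MSC_VER guard is only there so that other compilers skip the MSVC-specific pragma.

```cpp
// float_control is an MSVC pragma; the guard lets other compilers skip it.
#ifdef _MSC_VER
#pragma float_control(precise, on, push)  // precise semantics from here on
#endif

// Hypothetical name; same body as the test function in the question.
void divide_precise(float a, float b, float c, float &ret0, float &ret1) {
    ret0 = b / a;  // expected to stay a real division under 'precise'
    ret1 = c / a;
}

#ifdef _MSC_VER
#pragma float_control(pop)  // restore the previous floating-point settings
#endif
```

As the question notes, this works per function rather than per statement, and it may stop the function from being inlined into code compiled under the fast model.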


That document states that you can control the floating-point optimisations on a line-by-line basis using pragmas.


There is also __assume. You can use __assume(a/b != (a*(1/b))). I've never actually used __assume, but in theory it exists exactly to fine-tune the optimizer.


Can you put the functions containing those calculations in a separate source code file and compile only that file with the different settings?

I don't know if that is safe though; you'll need to check!
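A sketch of what that split might look like. The file names precise_div.cpp and main.cpp and the function name divide_both are hypothetical; /c, /O2, /fp:precise and /fp:fast are regular cl options.

```cpp
// precise_div.cpp (hypothetical file name)
//
// Build only this file with the precise model, everything else with fast:
//   cl /c /O2 /fp:precise precise_div.cpp
//   cl /c /O2 /fp:fast    main.cpp
//   link main.obj precise_div.obj
//
// main.cpp would only need the declaration:
//   void divide_both(float a, float b, float c, float &ret0, float &ret1);

void divide_both(float a, float b, float c, float &ret0, float &ret1) {
    ret0 = b / a;  // compiled under /fp:precise in this translation unit
    ret1 = c / a;
}
```

A call across translation units is normally not inlined, so the inlining cost mentioned in the question may apply here as well.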


A (weird) solution which I have found: whenever dividing by the same value more than once in a function, add some epsilon:

    a/b; c/b 

->

    a/(b+esp1); c/(b+esp2)

This also saves you from the occasional division by zero.
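A minimal sketch of the idea, with the two epsilons passed in as runtime parameters so that the compiler cannot prove both divisors are equal. The function name divide_with_eps and the parameter names are made up for illustration. Whether a particular compiler version really keeps two separate divisions should be checked in the generated assembly.

```cpp
// Hypothetical helper: divides a and c by (almost) the same value b.
// eps1 and eps2 are runtime values, so the two divisors are distinct
// expressions and a single shared reciprocal 1/b no longer fits both.
void divide_with_eps(float a, float c, float b,
                     float eps1, float eps2,
                     float &ret0, float &ret1) {
    ret0 = a / (b + eps1);
    ret1 = c / (b + eps2);
}
```

Note that an epsilon large enough to change b + eps also changes the result, while one smaller than half an ULP of b leaves the divisor unchanged at run time. For example, with b = 4.0f and an epsilon of 1e-30f, the sum is still exactly 4.0f.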
