Strange float behaviour in OpenMP

I am running the following OpenMP code

    #include <math.h>
    #include <omp.h>
    #include <stdio.h>

    #define NREC 1024
    #define NLIG 1024

    int main(void)
    {
        double S2 = 0.0;
        int nthreads, tid, a, b;
        int chunk = 10;   /* chunk size not shown in the question; 10 assumed */
        #pragma omp parallel shared(S2,nthreads,chunk) private(a,b,tid)
        {
            tid = omp_get_thread_num();
            if (tid == 0)
            {
                nthreads = omp_get_num_threads();
                printf("\nNumber of threads = %d\n", nthreads);
            }
            #pragma omp for schedule(dynamic,chunk) reduction(+:S2)
            for (a = 0; a < NREC; a++) {
                for (b = 0; b < NLIG; b++) {
                    S2 = S2 + cos(1 + sin(atan(sin(sqrt(a*2 + b*5) + cos(a) + sqrt(b)))));
                }
            } /* end for a */
        } /* end of parallel section */
        printf("S2 = %f\n", S2);
        return 0;
    }

For NREC=NLIG=1024 and higher values, on an 8-core board, I get up to a 7x speedup. The problem is that the final result for the variable S2 differs by 1 to 5% from the result obtained by the serial version. What could be the reason? Should I use some specific compilation options to avoid this strange float behaviour?


The order of additions/subtractions of floating-point numbers can affect the accuracy.

To take a simple example, suppose your machine stores only two significant decimal digits, and you are computing the value of 1 + 0.04 + 0.04.

  • If you do the left addition first, you get 1.04, which is rounded to 1. The second addition will give 1 again, so the final result is 1.

  • If you do the right addition first, you get 0.08. Added to 1, this gives 1.08 which is rounded to 1.1.

For maximum accuracy, it's best to add values from small to large. This is exactly what changes under an OpenMP reduction: each thread accumulates its own partial sum, and the partial sums are then combined in an order different from the serial loop's, so the rounding differs.
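
To see the effect with real float arithmetic, here is a minimal sketch (values chosen so the effect is visible in single precision): summing one large value and many small ones gives different results depending on the order.

    #include <stdio.h>

    int main(void)
    {
        float forward = 0.0f, backward = 0.0f;
        float v[1001];
        int i;

        v[0] = 1.0e8f;                       /* one large value... */
        for (i = 1; i <= 1000; i++)
            v[i] = 1.0f;                     /* ...and many small ones */

        /* large-to-small: each 1.0f is absorbed by 1.0e8f and lost */
        for (i = 0; i <= 1000; i++)
            forward += v[i];

        /* small-to-large: the 1.0f values accumulate before meeting 1.0e8f */
        for (i = 1000; i >= 0; i--)
            backward += v[i];

        printf("large-to-small: %.1f\n", forward);   /* 100000000.0 */
        printf("small-to-large: %.1f\n", backward);  /* 100001000.0 */
        return 0;
    }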

Another cause could be that the CPU's floating-point registers hold more bits than a float or double in main memory. Hence, if an intermediate result stays in a register it is kept at higher precision, but once it is spilled to memory it is rounded to the narrower format.
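
On x86 this is the classic 80-bit x87 register effect. You can check how your compiler evaluates intermediates with the C99 FLT_EVAL_METHOD macro; a minimal sketch (with GCC, the -ffloat-store option additionally forces intermediates to be stored and rounded to their declared type, at some speed cost):

    #include <float.h>
    #include <stdio.h>

    int main(void)
    {
        /* 0: intermediates kept in the declared type
           1: float and double intermediates evaluated as double
           2: all intermediates evaluated as long double (typical of x87)
          -1: indeterminable */
        printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
        return 0;
    }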

See also the discussion of floating-point accuracy in the C++ FAQ.


It is known that machine floating-point arithmetic loses precision when two nearly equal large values are subtracted (or two large values of opposite sign are added), leaving only a small difference: the most significant digits cancel. Summing a sequence of oscillating sign can therefore introduce severe error at each iteration. Another problematic case is when the magnitudes of the two operands differ greatly: the smaller operand is effectively absorbed and lost.
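
Both effects are easy to reproduce; a minimal sketch in double precision, assuming the default round-to-nearest mode (on x87 hardware with extended-precision intermediates the output can differ, which is exactly the register effect mentioned in the answer above):

    #include <stdio.h>

    int main(void)
    {
        double big = 1.0e16;   /* the spacing (ULP) of doubles here is 2.0 */

        /* absorption: 0.5 is below half the spacing, so it is lost */
        printf("(big + 0.5) - big = %.1f\n", (big + 0.5) - big);   /* 0.0 */

        /* cancellation: big + 3.0 is rounded to big + 4.0 on storage;
           subtracting big exposes the whole rounding error */
        printf("(big + 3.0) - big = %.1f\n", (big + 3.0) - big);   /* 4.0 */
        return 0;
    }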
It might be useful to separate the positive and negative operands, perform the summation of each group separately, then add (or subtract) the two group results.
If accuracy is crucial, it is probably worth pre-sorting each group and performing two partial sums within it: the first running from the middle of the group towards the largest values (the head), the second from the smallest values (the tail) towards the middle. The group sum is then the sum of the two partial runs.
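
A minimal sketch of the grouping idea, simplified to a single smallest-to-largest pass per group rather than the two-run head/tail scheme described above (grouped_sum and by_magnitude are illustrative names):

    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* qsort comparator: ascending by magnitude */
    static int by_magnitude(const void *pa, const void *pb)
    {
        double a = fabs(*(const double *)pa);
        double b = fabs(*(const double *)pb);
        return (a > b) - (a < b);
    }

    /* Sum an array by separating positive and negative values,
       sorting each group by magnitude, and accumulating each group
       from smallest to largest before combining the two results. */
    double grouped_sum(const double *x, size_t n)
    {
        double *pos = malloc(n * sizeof *pos);   /* error handling omitted */
        double *neg = malloc(n * sizeof *neg);
        size_t np = 0, nn = 0, i;
        double sp = 0.0, sn = 0.0;

        for (i = 0; i < n; i++) {
            if (x[i] >= 0.0)
                pos[np++] = x[i];
            else
                neg[nn++] = x[i];
        }
        qsort(pos, np, sizeof *pos, by_magnitude);
        qsort(neg, nn, sizeof *neg, by_magnitude);

        for (i = 0; i < np; i++) sp += pos[i];   /* small-to-large */
        for (i = 0; i < nn; i++) sn += neg[i];

        free(pos);
        free(neg);
        return sp + sn;   /* add (subtract) the two group results */
    }

Note that the sorting makes this O(n log n); for a sum like the one in the question, a wider accumulator (double or long double) or a compensated summation is usually a cheaper way to tighten the error.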
