CUDA fast math operations
This is my code:
__device__ void calculateDT(float *devD, int *devImg, int cntVoxelLi, int *neighVoxels)
{
    float minV = devD[cntVoxelLi];
    int cv = devImg[cntVoxelLi];
    float v = 0, cuVal = 0;
    int c1 = 0, d1 = 0, r1 = 0;
    GetInd2Sub(cntVoxelLi, r1, c1, d1);
    for (int ind = 0; ind < 9; ind++)
    {
        v = pow(float(cv - devImg[neighVoxels[ind]]), 2);
        cuVal = devD[neighVoxels[ind]] + (1 - exp(-v / 100));
        minV = min(minV, cuVal);
    }
    devD[cntVoxelLi] = minV;
}
When I run the entire program it takes about 15 seconds. But when I remove the
exp(-v/100)
it takes only 7 seconds, so it seems this exp operation takes much of the time. I tried the expf function as well. How can I improve the performance?
The performance difference you are seeing is mostly the result of compiler optimization. When you remove the exp expression, the variable v becomes unused, and the compiler will remove the calculation of v because it is effectively dead code. So the large drop in execution time is due to the elimination of all of the floating point computation from the kernel loop, not from the removal of the exp function alone.
As for performance optimizations, the obvious one is to eliminate the use of pow for computing a simple square (although the compiler may already be doing this itself), and to tidy up the floating point expressions to eliminate a number of implicit integer-to-float and double-to-float conversions (hint: 0 is an integer, 0. is double precision, and 0.f is single precision).
It is difficult to comment on the kernel's memory transaction performance from the code you have posted. The CUDA 4 visual profiler has some useful diagnostics which show whether a piece of code is memory or arithmetic limited; you might find it useful to profile the code and see what it reports.
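Since the question title mentions fast math: if the reduced precision is acceptable, nvcc can map expf and the other single-precision math functions onto the faster hardware intrinsics for the whole file. A minimal compile-line sketch (the source file name is assumed):

```shell
# --use_fast_math makes nvcc substitute the fast intrinsics
# (e.g. __expf for expf) and enables other fast-math simplifications.
nvcc -O3 --use_fast_math -o dt kernel.cu
```

Alternatively, you can call __expf(-v / 100.0f) explicitly in the kernel to limit the precision trade-off to that one call site.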