Replacing a C for loop with CUDA

What is the best way to do this in CUDA?

...
for (int i = 0; i < size; ++i)
  for (int j = i + 1; j < size; ++j)
    temp_norm += exp((train[i] - train[j]) / tau);

Would this be equivalent?

...
int i = threadIdx.x + blockIdx.x * blockDim.x;
int j = threadIdx.y + blockIdx.y * blockDim.y;

if (i>=size || j>=size) return;

if (j > i)
  temp_norm += exp((train[i] - train[j]) / tau);

Any help would be much appreciated!


How best to implement this really depends on how big size is. But assuming it is quite large, e.g. 1000 or more...

To do it the way you suggest, you would need to use atomicAdd(), which can be expensive if too many threads atomically add to the same address. A better way is probably to use a parallel reduction.
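For reference, the atomicAdd() version of your 2-D grid approach might look like the following untested sketch (the kernel name and the single result accumulator are my own; note that atomicAdd() on float requires compute capability 2.0 or later):

```cuda
// Hypothetical kernel: each (i, j) pair with j > i contributes one term.
// Every contributing thread atomically adds into one accumulator, which
// serializes under contention -- hence the recommendation to reduce instead.
__global__ void pairSumAtomic(const float *train, int size, float tau,
                              float *result)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    int j = threadIdx.y + blockIdx.y * blockDim.y;

    if (i >= size || j >= size) return;

    if (j > i)
        atomicAdd(result, expf((train[i] - train[j]) / tau));
}
```

With size around 1000 that is roughly 500,000 atomic adds to the same address, which is exactly the contention pattern you want to avoid.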

Check out the "reduction" sample in the NVIDIA CUDA SDK.

YMMV with the following since it is untested, and I don't know your data size, but something like this should work. Use the "reduction6" kernel from that example, but add your computation to the first while loop. Replace the initialization of i and gridSize with

unsigned int i = blockIdx.x*blockSize + threadIdx.x;
unsigned int gridSize = blockSize * gridDim.x;

Replace the while (i < n) loop with

while (i < size)
{
  for (unsigned int j = i+1; j<size; ++j)
      mySum += exp((train[j]-train[i])/tau);   
  i += gridSize;
}
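Putting those pieces together, the modified kernel might look roughly like the following untested sketch (blockSize is the template parameter from the SDK sample; I have used a simple shared-memory tree reduction rather than the sample's fully unrolled one):

```cuda
// Sketch only: each block writes one partial sum to blockSums; sum those
// on the host or with a second reduction pass, as in the SDK sample.
template <unsigned int blockSize>
__global__ void pairSumReduce(const float *train, unsigned int size,
                              float tau, float *blockSums)
{
    extern __shared__ float sdata[];

    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockSize + threadIdx.x;
    unsigned int gridSize = blockSize * gridDim.x;

    // Each thread accumulates the inner j-loop for a strided set of i values.
    float mySum = 0.0f;
    while (i < size)
    {
        for (unsigned int j = i + 1; j < size; ++j)
            mySum += expf((train[j] - train[i]) / tau);
        i += gridSize;
    }

    // Standard shared-memory tree reduction within the block.
    sdata[tid] = mySum;
    __syncthreads();

    for (unsigned int s = blockSize / 2; s > 0; s >>= 1)
    {
        if (tid < s)
            sdata[tid] += sdata[tid + s];
        __syncthreads();
    }

    if (tid == 0)
        blockSums[blockIdx.x] = sdata[0];
}
```

One thing to keep in mind: the per-i work is unbalanced (large i values have short inner loops), which the grid-stride outer loop only partially smooths out.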

(Note, floating point arithmetic is non-associative, so the different order of operations in a parallel implementation may give you a slightly different answer than the sequential implementation. It may even give you a slightly more accurate answer due to the balanced tree reduction, depending on your input data.)
