Sparse array in CUDA or OpenCL

I have a large array (say 512K elements), GPU resident, where only a small fraction of elements (say 5K randomly distributed elements - set S) needs to be processed. The algorithm to find out which elements belong to S is very efficient, so I can easily create an array A of pointers or indexes to elements from set S.

What is the most efficient way to run a CUDA or OpenCL kernel only over elements from S? Can I run a kernel over array A? All examples I've seen so far deal with contiguous 1D, 2D, or 3D arrays. Is there any problem with introducing one layer of indirection?


In CUDA, contiguous (rather than random) memory access is preferred because it lets the hardware coalesce the accesses of neighboring threads. Creating an array of randomly distributed indexes and processing one index from A per thread is straightforward, something like this:

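__global__ void kernel_func(const unsigned *A, float *S, unsigned n)
{
    const unsigned idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx >= n) return;           // n = number of indexes in A; skip surplus threads
    const unsigned S_idx = A[idx];  // contiguous read of the index list

    S[S_idx] *= 5; // for example...
    ...
}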

But the accesses to S through random indexes will be slow; this is the most likely bottleneck.
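For completeness, here is a minimal sketch of how such a kernel might be launched from the host. The names scale_selected_kernel and scale_selected are hypothetical, and the host-to-device copies are only there to keep the example self-contained (in the question the array already lives on the GPU). It assumes the indexes are unique and in range, and reduces error handling to a single bool.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: one thread per entry of the index list A.
__global__ void scale_selected_kernel(const unsigned *A, float *S, unsigned n, float factor)
{
    const unsigned idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx >= n) return;      // the last block may be only partly used
    S[A[idx]] *= factor;       // indirect access into S
}

// Hypothetical host wrapper: multiplies data[i] by factor for every i in indices.
// Assumes every index is unique and smaller than data.size().
bool scale_selected(std::vector<float> &data, const std::vector<unsigned> &indices, float factor)
{
    const unsigned n = static_cast<unsigned>(indices.size());
    if (n == 0) return true;   // nothing to process

    const size_t dataBytes = data.size() * sizeof(float);
    const size_t idxBytes = n * sizeof(unsigned);
    float *dS = nullptr;
    unsigned *dA = nullptr;

    bool ok = cudaMalloc(&dS, dataBytes) == cudaSuccess
           && cudaMalloc(&dA, idxBytes) == cudaSuccess
           && cudaMemcpy(dS, data.data(), dataBytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dA, indices.data(), idxBytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const unsigned threads = 256;                         // a starting point, not a tuned value
        const unsigned blocks = (n + threads - 1) / threads;  // round up so all n indexes are covered
        scale_selected_kernel<<<blocks, threads>>>(dA, dS, n, factor);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(data.data(), dS, dataBytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dS);              // cudaFree(nullptr) is a no-op
    cudaFree(dA);
    return ok;
}
```

The grid is sized by the number of indexes (about 5K in the question), not by the size of the full array, so only the elements of S get a thread.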

If you decide to use CUDA, you will need to experiment a lot with block and grid sizes, minimize register use per thread (to maximize the number of blocks resident per multiprocessor), and perhaps sort A so that neighboring threads access neighboring S_idx values...
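Sorting A could be done on the GPU with Thrust, which ships with the CUDA toolkit. The sketch below uses a hypothetical helper, sort_indices, that copies the result back to the host only so it can be inspected; in real code the sorted device_vector would stay on the device and be handed to the kernel through thrust::raw_pointer_cast. It assumes the per-element work is independent, so the processing order does not matter.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <vector>

// Hypothetical helper: sorts the index list on the GPU and returns it to the host.
std::vector<unsigned> sort_indices(const std::vector<unsigned> &indices)
{
    // Copy the index list to the device.
    thrust::device_vector<unsigned> dA(indices.begin(), indices.end());

    // Ascending sort on the GPU: after this, thread k and thread k+1
    // read indexes that are as close together as the set S allows.
    thrust::sort(dA.begin(), dA.end());

    // Copy back to the host for inspection only.
    std::vector<unsigned> sorted(dA.size());
    thrust::copy(dA.begin(), dA.end(), sorted.begin());
    return sorted;
}
```

Whether the sort pays for itself probably depends on how often A is reused between rebuilds, so it is worth timing both variants.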


If you sort your indexes, or build the list already sorted, that will help performance a lot. If there are clusters of indexes, try using texture memory. And if each thread accesses a number of elements with some overlap between threads, I found that using shared memory gives a significant performance boost.


No problem at all with the one level of indirection. I use that a fair amount in my own CUDA code. Is the set S likely to remain static over time? If so, it may very well be worth generating the lookup A like you said.

Also, texture memory will be your friend in providing cache locality. The type of texture you use (1D, 2D, or 3D) will depend on your problem.
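As a rough illustration of the texture route, the sketch below binds the array to a 1D linear texture object and reads the selected elements through tex1Dfetch. The names gather_kernel and gather_via_texture are hypothetical. It assumes a device that supports texture objects, in-range indexes, and read-only access to S during the kernel (textures cannot be written through). Error checks are omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: reads S[A[idx]] through the texture cache.
__global__ void gather_kernel(cudaTextureObject_t tex, const unsigned *A, float *out, unsigned n)
{
    const unsigned idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < n)
        out[idx] = tex1Dfetch<float>(tex, static_cast<int>(A[idx]));
}

// Hypothetical host wrapper: returns data[indices[k]] for every k.
std::vector<float> gather_via_texture(const std::vector<float> &data,
                                      const std::vector<unsigned> &indices)
{
    const unsigned n = static_cast<unsigned>(indices.size());
    std::vector<float> result(n);
    if (n == 0) return result;

    const size_t dataBytes = data.size() * sizeof(float);
    float *dS = nullptr, *dOut = nullptr;
    unsigned *dA = nullptr;
    cudaMalloc(&dS, dataBytes);
    cudaMalloc(&dA, n * sizeof(unsigned));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dS, data.data(), dataBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dA, indices.data(), n * sizeof(unsigned), cudaMemcpyHostToDevice);

    // Describe the existing linear device buffer as a texture resource.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dS;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = dataBytes;

    cudaTextureDesc texDesc = {};
    texDesc.readMode = cudaReadModeElementType;   // return the stored floats unchanged

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    const unsigned threads = 256;
    const unsigned blocks = (n + threads - 1) / threads;
    gather_kernel<<<blocks, threads>>>(tex, dA, dOut, n);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(result.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(dS);
    cudaFree(dA);
    cudaFree(dOut);
    return result;
}
```

A 1D linear texture is the simplest fit for a flat array like the one in the question; 2D or 3D textures only make sense if the data has that shape.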
