How do I pass data to a shared variable in CUDA?
I have a kernel that takes three arrays: the first, d_A1, holds no input data and is used only to write results back; the other two, d_D1 and d_ST1, contain data.
The size of the first array is:
d_A1[13000000]
The size of the second array is:
d_D1[421]
The size of the third array is:
d_ST1[21]
N is 13000000
TestArray<<<n_blocks, block_size>>>(d_A1,N, d_D1, d_ST1);
Now I only want to pass the data of d_D1[421] and d_ST1[21] to shared arrays, so I created the shared arrays as:
__global__ void TestArray(int* A1, unsigned int N, int* D1, unsigned int* ST1)
{
unsigned int __align__(16) tid = threadIdx.x;
unsigned int __align__(16) idx = __umul24(blockDim.x, blockIdx.x) + threadIdx.x;
__shared__ unsigned int __align__(16) s_D1[441]; //Shared array for d_D1
__shared__ unsigned int __align__(16) s_ST1[21]; //Shared array for d_ST1
if (idx < N) //13000000
{
Q. How do I pass the data of d_D1[441] and d_ST1[21] to s_D1[441] and s_ST1[21]? I tried:
while (idx < 441)
s_D1[tid] = d_D1[idx]
__syncthreads();
while (idx < 21)
s_ST1[tid] = d_ST1[idx]
__syncthreads();
but the computer freezes and I have to restart it. I also tried one at a time, namely, only the first while and then only the second while, with no luck.
If I use global memory directly, namely d_D1 and d_ST1, everything works. So the question is: How do you pass data to a shared variable/array when the size of the array is not N?
} //End of kernel processing
}
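A likely cause of the freeze: idx never changes inside `while (idx < 441)`, so every thread that enters the loop spins forever. Also, shared memory is per block, so each block has to load its own copy using the block-local index rather than the global one. Below is a minimal sketch of a cooperative load, not a definitive implementation. It assumes the sizes 421 and 21 stated at the top of the question (the question also mentions 441). The macros `D1_SIZE` and `ST1_SIZE`, the placeholder computation written to `A1`, and the host helper `runTestArray` are hypothetical names added for illustration.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define D1_SIZE 421   // size of d_D1 as stated in the question (assumption)
#define ST1_SIZE 21   // size of d_ST1 as stated in the question

__global__ void TestArray(int* A1, unsigned int N, const int* D1, const unsigned int* ST1)
{
    __shared__ int s_D1[D1_SIZE];            // per-block copy of d_D1
    __shared__ unsigned int s_ST1[ST1_SIZE]; // per-block copy of d_ST1

    // Strided loops over the block-local index: each loop terminates
    // and covers every element for any block size.
    for (unsigned int i = threadIdx.x; i < D1_SIZE; i += blockDim.x)
        s_D1[i] = D1[i];
    for (unsigned int i = threadIdx.x; i < ST1_SIZE; i += blockDim.x)
        s_ST1[i] = ST1[i];

    // Reached by every thread of the block, outside any divergent branch.
    __syncthreads();

    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N) {
        // Placeholder computation (hypothetical): just reads the shared tables.
        A1[idx] = s_D1[idx % D1_SIZE] + (int)s_ST1[idx % ST1_SIZE];
    }
}

// Hypothetical host-side check: returns the number of wrong outputs,
// or -1 if a CUDA call fails.
int runTestArray(unsigned int N)
{
    std::vector<int> h_D1(D1_SIZE), h_A1(N);
    std::vector<unsigned int> h_ST1(ST1_SIZE);
    for (int i = 0; i < D1_SIZE; ++i) h_D1[i] = i;
    for (int i = 0; i < ST1_SIZE; ++i) h_ST1[i] = 1000u * i;

    int *d_A1 = nullptr, *d_D1 = nullptr;
    unsigned int* d_ST1 = nullptr;
    bool ok = cudaMalloc(&d_A1, N * sizeof(int)) == cudaSuccess
           && cudaMalloc(&d_D1, D1_SIZE * sizeof(int)) == cudaSuccess
           && cudaMalloc(&d_ST1, ST1_SIZE * sizeof(unsigned int)) == cudaSuccess;
    if (ok) {
        cudaMemcpy(d_D1, h_D1.data(), D1_SIZE * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemcpy(d_ST1, h_ST1.data(), ST1_SIZE * sizeof(unsigned int), cudaMemcpyHostToDevice);

        const unsigned int block_size = 256;
        const unsigned int n_blocks = (N + block_size - 1) / block_size;
        TestArray<<<n_blocks, block_size>>>(d_A1, N, d_D1, d_ST1);

        ok = cudaMemcpy(h_A1.data(), d_A1, N * sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_A1);
    cudaFree(d_D1);
    cudaFree(d_ST1);
    if (!ok) return -1;

    int mismatches = 0;
    for (unsigned int i = 0; i < N; ++i)
        if (h_A1[i] != h_D1[i % D1_SIZE] + (int)h_ST1[i % ST1_SIZE]) ++mismatches;
    return mismatches;
}
```

The strided `for` loop is used instead of a plain `if (tid < 421)` so the load still covers the whole table when the block has fewer than 421 threads. A single `__syncthreads()` placed before the `if (idx < N)` branch keeps the barrier reachable by all threads in the block.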
Great, you're right, it has to be if (tid < 21) d_ST1[tid] = ST1[tid], since one block is enough for those two sizes. Thank you very much indeed. However, when do you use if (idx < n) d_st1[tid] = st1[idx]?
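On the follow-up question: a load of the form `s[tid] = in[idx]` is typically used when each block needs its own slice (tile) of a large array, rather than the same small table in every block. The sketch below illustrates that case under stated assumptions: the kernel `reverseTiles`, the constant `TILE`, and the host helper `runReverseTiles` are hypothetical and not from the question, and the kernel assumes it is launched with exactly `TILE` threads per block.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define TILE 256  // hypothetical block size; the launch must use this value

// Each block loads its own slice of `in` into shared memory,
// then writes that slice back in reverse order.
__global__ void reverseTiles(const int* in, int* out, unsigned int n)
{
    __shared__ int s_tile[TILE];
    unsigned int tid = threadIdx.x;
    unsigned int base = blockIdx.x * blockDim.x;
    unsigned int idx = base + tid;

    // Global index on the source side, block-local index on the shared side.
    if (idx < n) s_tile[tid] = in[idx];
    __syncthreads();  // outside the branch, so every thread reaches it

    if (idx < n) {
        // Number of valid elements in this block's tile (the last tile may be short).
        unsigned int remaining = n - base;
        unsigned int valid = remaining < blockDim.x ? remaining : blockDim.x;
        out[idx] = s_tile[valid - 1 - tid];
    }
}

// Hypothetical host-side check: returns the number of wrong outputs,
// or -1 if a CUDA call fails.
int runReverseTiles(unsigned int n)
{
    std::vector<int> h_in(n), h_out(n);
    for (unsigned int i = 0; i < n; ++i) h_in[i] = (int)i;

    int *d_in = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc(&d_in, n * sizeof(int)) == cudaSuccess
           && cudaMalloc(&d_out, n * sizeof(int)) == cudaSuccess;
    if (ok) {
        cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);
        const unsigned int n_blocks = (n + TILE - 1) / TILE;
        reverseTiles<<<n_blocks, TILE>>>(d_in, d_out, n);
        ok = cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_in);
    cudaFree(d_out);
    if (!ok) return -1;

    int mismatches = 0;
    for (unsigned int i = 0; i < n; ++i) {
        unsigned int base = (i / TILE) * TILE;
        unsigned int valid = (n - base < TILE) ? (n - base) : TILE;
        if (h_out[i] != h_in[base + valid - 1 - (i - base)]) ++mismatches;
    }
    return mismatches;
}
```

The rule of thumb this sketch relies on: index the shared array with the block-local `tid`, and index global memory with `idx` only when the data a block needs depends on which block it is.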