How many grid dimensions can I use on a CUDA compute capability 2.0 card?

I want to use a 3D grid for calculations with CUDA. This page [1] and this answer [2] say I can use three grid dimensions on a compute capability 2.0 device, but querying my device properties gives me the following:

   --- General Information for device 0 ---
Name:  Quadro 4000
Compute capability:  2.0
Clock rate:  950000
Device copy overlap:  Enabled
Kernel execution timeout :  Enabled
   --- Memory Information for device 0 ---
Total global mem:  2146631680
Total constant Mem:  65536
Max mem pitch:  2147483647
Texture Alignment:  512
   --- MP Information for device 0 ---
Multiprocessor count:  8
Shared mem per mp:  49152
Registers per mp:  32768
Threads in warp:  32
Max threads per block:  1024
Max thread dimensions:  (1024, 1024, 64)
Max grid dimensions:  (65535, 65535, 1)

If I try to use a 3D grid in my code, nothing happens (the second kernel below appears to have no effect):

__global__ void updateBuffer( ... )
{
  int x = blockIdx.x;
  int y = blockIdx.y;
  int z = threadIdx.x;

  int offset =
      x +
      y * width +
      z * width * height;

  buffer[offset] = ...;
}

__global__ void updateBuffer2( ... )
{
  int x = blockIdx.x;
  int y = blockIdx.y;
  int z = blockIdx.z;

  int offset =
      x +
      y * width +
      z * width * height;

  buffer[offset] = ...;
}

void callKernel() {
  dim3 blocks(extW,extH,1);
  dim3 threads(extD,1,1);

  dim3 blocks2(extW,extH,extD);
  dim3 threads2(1,1,1);


  updateBuffer<<<blocks,threads>>>( ... ); // works fine
  updateBuffer2<<<blocks2,threads2>>>( ... ); // nothing happens
}

So is it the case that 3D grids do not work with some cards?

[1] http://en.wikipedia.org/wiki/CUDA#Version_features_and_specifications
[2] Maximum blocks per grid:CUDA


I fixed it by installing the latest NVIDIA driver and updating to CUDA 4.0.
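Since the fix came from updating the driver and toolkit, it can help to confirm which versions are actually in use. Here is a minimal sketch, assuming a single installed toolkit; decodeCudaVersion is a hypothetical helper name, not part of the CUDA API. It relies on cudaDriverGetVersion and cudaRuntimeGetVersion, which report the version as 1000 * major + 10 * minor.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): split the integer version
// encoding 1000 * major + 10 * minor into its two parts.
static void decodeCudaVersion(int version, int *major, int *minor)
{
    *major = version / 1000;
    *minor = (version % 1000) / 10;
}

int main()
{
    int driverVersion = 0;
    int runtimeVersion = 0;

    // Highest CUDA version supported by the installed driver.
    cudaError_t err = cudaDriverGetVersion(&driverVersion);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDriverGetVersion: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Version of the CUDA runtime the program was built against.
    err = cudaRuntimeGetVersion(&runtimeVersion);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaRuntimeGetVersion: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int major = 0, minor = 0;
    decodeCudaVersion(driverVersion, &major, &minor);
    printf("Driver supports CUDA %d.%d\n", major, minor);
    decodeCudaVersion(runtimeVersion, &major, &minor);
    printf("Runtime is CUDA %d.%d\n", major, minor);
    return 0;
}
```

If either value is below 4.0, the reported maximum grid z-dimension may still be 1, as in the device query in the question.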


There's a clue in this line:

Max grid dimensions:  (65535, 65535, 1)
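
A launch that exceeds the reported limits normally fails with an error rather than running, and that error goes unnoticed unless it is checked, which would explain why "nothing happens". The sketch below compares the requested grid against prop.maxGridSize and checks the launch result. The sizes, the gridFits helper, and the touch kernel are illustrative assumptions standing in for the elided code in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if every requested grid extent is within
// the limit the device reports for that dimension.
static bool gridFits(const int requested[3], const int maxGrid[3])
{
    for (int i = 0; i < 3; ++i) {
        if (requested[i] < 1 || requested[i] > maxGrid[i]) {
            return false;
        }
    }
    return true;
}

// Placeholder kernel using the same indexing as updateBuffer2.
__global__ void touch(int *buffer, int width, int height)
{
    int offset = blockIdx.x + blockIdx.y * width + blockIdx.z * width * height;
    buffer[offset] = offset;
}

int main()
{
    const int extW = 8, extH = 8, extD = 8;  // assumed sizes for illustration
    const int requested[3] = { extW, extH, extD };

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Max grid dimensions: (%d, %d, %d)\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

    if (!gridFits(requested, prop.maxGridSize)) {
        fprintf(stderr, "Requested grid exceeds the reported limits\n");
        return 1;
    }

    int *buffer = NULL;
    err = cudaMalloc((void **)&buffer, sizeof(int) * extW * extH * extD);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    dim3 blocks(extW, extH, extD);
    dim3 threads(1, 1, 1);
    touch<<<blocks, threads>>>(buffer, extW, extH);

    // An invalid launch configuration is reported here, not at the launch itself.
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel launch failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(buffer);
    return err == cudaSuccess ? 0 : 1;
}
```

With the limits shown in the question, (65535, 65535, 1), the gridFits check would reject any grid whose z extent is greater than 1.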