CUDA, cuPrintf causes "unspecified launch failure"?
I have a kernel that runs twice, each time with a different grid size.
My problem is with cuPrintf. If I leave out cudaPrintfInit() before the kernel launch, and cudaPrintfDisplay(stdout, true) and cudaPrintfEnd() after it, I get no error. When I add them, I get an "unspecified launch failure" error.
In my device code, the only printing is this one conditional block:
if (threadIdx.x==0) {
cuPrintf("MAX:%f x:%d y:%d\n", maxVal, blockIdx.x, blockIdx.y);
}
I'm using CUDA 4.0 on a card with compute capability 2.0, so I'm compiling my code with this command:
nvcc LB2.0.cu -arch=compute_20 -code=sm_20
If you are on a CC 2.0 GPU, you don't need cuPrintf at all -- CUDA has printf built-in for CC-2.0 and higher GPUs. So just replace your call to cuPrintf with this:
#if __CUDA_ARCH__ >= 200
if (threadIdx.x==0) {
printf("MAX:%f x:%d y:%d\n", maxVal, blockIdx.x, blockIdx.y);
}
#endif
(Note you only need the #if / #endif lines if you are compiling your code for sm_20 and also earlier versions. With the example compilation command line you gave, you can eliminate them.)
With printf, you don't need cudaPrintfInit() or cudaPrintfDisplay() -- the output is handled automatically. However, if you print a lot of data, you may need to increase the default printf FIFO size by calling cudaDeviceSetLimit() with the cudaLimitPrintfFifoSize option.
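As a rough sketch of how that might look on the host side: the kernel name reportMax, the 8 MB buffer size, the grid dimensions and the value passed in are all made up for illustration and are not taken from the question. It assumes a CC 2.0 or later device and a toolkit that provides cudaDeviceSetLimit() and cudaDeviceSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 of every block prints one line,
// mirroring the printing pattern from the question.
__global__ void reportMax(float maxVal)
{
    if (threadIdx.x == 0) {
        printf("MAX:%f x:%d y:%d\n", maxVal, blockIdx.x, blockIdx.y);
    }
}

int main()
{
    // Enlarge the device-side printf buffer before the first launch.
    // 8 MB is an arbitrary illustrative value, not a recommendation.
    cudaError_t err = cudaDeviceSetLimit(cudaLimitPrintfFifoSize,
                                         8 * 1024 * 1024);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetLimit: %s\n", cudaGetErrorString(err));
        return 1;
    }

    dim3 grid(4, 2);   // illustrative grid size
    reportMax<<<grid, 32>>>(1.5f);

    // Wait for the kernel to finish; device printf output is flushed
    // to stdout at synchronization points like this one.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

This should build with the same nvcc options as in the question. Checking the value returned by cudaDeviceSynchronize() also gives you a place to catch an "unspecified launch failure" that comes from the kernel itself rather than from the printing.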