
Disabling ALL asynchronous execution in CUDA programs

According to the CUDA programming guide, you can disable asynchronous kernel launches at run time by setting an environment variable (CUDA_LAUNCH_BLOCKING=1).

This is a helpful tool for debugging. I also want to determine the benefit in my code from using concurrent kernels and transfers.

I want to also disable other concurrent calls, in particular cudaMemcpyAsync.

Does CUDA_LAUNCH_BLOCKING affect these kinds of calls in addition to kernel launches? I suspect not. What would be the best alternative? I can add cudaStreamSynchronize calls, but I would prefer a run time solution. I can run in the debugger, but that will affect the timing and defeat the purpose.


Setting CUDA_LAUNCH_BLOCKING won't affect the streams API at all. If you add some debug code to force all your streams code to use stream 0, all the calls other than kernel launches will revert to synchronous behaviour.
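A rough sketch of what that debug code might look like, so the switch stays a run time one. The environment variable MYAPP_FORCE_STREAM0 and the helpers flag_enabled and debug_stream are made-up names for illustration, not part of CUDA. How far the copies actually block the host on stream 0 may also depend on the CUDA version and on whether the host buffer is pinned, so treat this as a starting point for timing comparisons rather than a guarantee.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Hypothetical helper: true only when the value is exactly "1".
static bool flag_enabled(const char* value) {
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Hypothetical helper: when MYAPP_FORCE_STREAM0=1 (an assumed variable
// name), return stream 0 instead of the stream the caller asked for.
static cudaStream_t debug_stream(cudaStream_t requested) {
    static const bool force = flag_enabled(std::getenv("MYAPP_FORCE_STREAM0"));
    cudaStream_t stream_zero = 0;
    return force ? stream_zero : requested;
}

__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float* host = nullptr;
    float* dev = nullptr;
    cudaStream_t stream;

    CHECK(cudaMallocHost(&host, bytes));  // pinned host memory
    CHECK(cudaMalloc(&dev, bytes));
    CHECK(cudaStreamCreate(&stream));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // Every stream argument goes through debug_stream(), so one
    // environment variable redirects all of the work to stream 0.
    cudaStream_t s = debug_stream(stream);
    CHECK(cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, s));
    scale<<<(n + 255) / 256, 256, 0, s>>>(dev, n, 2.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(host, dev, bytes, cudaMemcpyDeviceToHost, s));
    CHECK(cudaStreamSynchronize(s));

    std::printf("host[0] = %f\n", host[0]);

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(dev));
    CHECK(cudaFreeHost(host));
    return 0;
}
```

Running the same binary once normally and once with MYAPP_FORCE_STREAM0=1 CUDA_LAUNCH_BLOCKING=1 in the environment should give the two timings to compare, without recompiling or attaching a debugger.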
