
What does pycuda.debug actually do?

As part of a larger project, I've come across a strangely consistent bug that I can't get my head around: an archetypal 'black box' bug. When running with cuda-gdb python -m pycuda.debug prog.py -args, it runs fine, but slowly. If I drop pycuda.debug, it breaks, consistently, at exactly the same point in a multiple-kernel execution.

To explain: I have (currently) three kernels, used in different grid and block arrangements to solve 'slices' of a larger optimisation problem. Strictly speaking, these should either work or not, since the functions themselves are told nothing but 'here's some more data'; beyond the contents of that data they know nothing, such as the iteration number or whether their input is partitioned, and up until this one point they perform perfectly.

Basically, I can't see what's happening without pycuda.debug exposing the debugging symbols to GDB, but I also can't reproduce the problem WITH pycuda.debug.

What does pycuda.debug actually do, so I know what to look for in my kernel code?


Almost nothing. It mostly sets compiler flags in the pycuda.driver module so that CUDA code gets compiled with the necessary debugging symbols and assembled in the way cuda-gdb requires. The rest is a tiny wrapper that runs your script under those settings so that everything just works. The whole thing is about 20 lines of Python; you can see the code in the source distribution if you want.
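For reference, here is a minimal sketch of roughly what that wrapper does, assuming a current PyCUDA where pycuda.driver.set_debugging() is the hook; check pycuda/debug.py in your own source tree for the authoritative version:

    # Hedged sketch of roughly what `python -m pycuda.debug prog.py args...`
    # does: flip PyCUDA's global debugging flag, then execute the target
    # script so every subsequent kernel compile picks up the flags that
    # cuda-gdb needs.
    import sys

    import pycuda.driver

    pycuda.driver.set_debugging()  # later SourceModule compiles get -g / -G

    script = sys.argv[1]           # the program to run, e.g. prog.py
    sys.argv = sys.argv[1:]        # make argv look normal to that script
    with open(script) as f:
        exec(compile(f.read(), script, "exec"))

If you want the same effect on a single kernel without the wrapper, SourceModule also accepts compiler options directly, e.g. SourceModule(src, options=["-g", "-G"]).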

The key thing here is that code run in the debugger spills everything from registers and shared memory to local memory, so that the driver can read local program state. So if you have code that runs when built for the debugger and fails when built normally, it usually means there is a shared-memory buffer overflow or pointer error which is causing the GPU equivalent of a segfault.
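To make that concrete, here is a hedged, self-contained sketch (the kernel and names are invented for illustration, assuming a working PyCUDA install) of the kind of shared-memory overflow that can behave exactly this way: it compiles cleanly and may appear to run fine under one build configuration while silently corrupting state under another.

    import numpy as np
    import pycuda.autoinit          # creates a CUDA context on the default GPU
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    mod = SourceModule("""
    __global__ void overflow(float *out)
    {
        __shared__ float buf[32];
        int t = threadIdx.x;
        buf[t] = (float) t;
        if (t == 31)
            buf[32] = 99.0f;   // out-of-bounds write: valid indices are 0..31
        __syncthreads();
        out[t] = buf[t];
    }
    """)

    overflow = mod.get_function("overflow")
    out = np.zeros(32, dtype=np.float32)
    # One block of 32 threads; the stray write lands just past `buf` and can
    # clobber neighbouring on-chip state in a normal (non-debug) build.
    overflow(drv.Out(out), block=(32, 1, 1), grid=(1, 1))
    print(out)

Running something like this under cuda-memcheck (shipped with the CUDA toolkit) flags the out-of-bounds access directly, which is often faster than bisecting behaviour with and without pycuda.debug.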
