Passing CUDA context to worker pthreads
I have some CUDA kernels I want to run in individual pthreads.
Each pthread has to execute, say, 3 CUDA kernels, and they must be executed sequentially.
I thought I would pass each pthread a reference to a stream, so that its 3 CUDA kernels would all execute sequentially in the same stream.
I could get this working with a different context for each pthread, which would then execute the kernels as normal, but that seems to add a lot of overhead.
So how do I make each pthread work in the same context, concurrently with the other pthreads?
Thanks
Before CUDA 4.0, the way to access a given context from different CPU threads was to use cuCtxPopCurrent()/cuCtxPushCurrent(). A context could only be current to one CPU thread at a time.
As of CUDA 4.0, you can call cudaSetDevice() in each pthread, and the device's context can be current to more than one thread at a time.
The kernel invocations will be serialized by the context in the order received, but you may have to perform CPU thread synchronization to make sure the work is submitted in the order desired.
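As a rough illustration of the CUDA 4.0+ approach, here is a minimal sketch in which every pthread calls cudaSetDevice(0), creates its own stream, and launches three kernels into that stream so they run in order. The kernels stepA/stepB/stepC, the WorkerArgs struct, the CHECK macro, and the sizes are all made up for the example and are not from the question; it also assumes a single GPU at device index 0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <pthread.h>

// Hypothetical error-check helper: report the failure and leave the worker early.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            return NULL;                                              \
        }                                                             \
    } while (0)

// Placeholder kernels standing in for the three real ones.
__global__ void stepA(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = 1.0f;
}
__global__ void stepB(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}
__global__ void stepC(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 3.0f;
}

// Hypothetical per-thread argument block.
struct WorkerArgs {
    int id;
    float result;
};

static void *worker(void *p) {
    WorkerArgs *args = static_cast<WorkerArgs *>(p);
    const int n = 1 << 20;
    const int block = 256;
    const int grid = (n + block - 1) / block;

    // All threads select the same device, so they share its context.
    CHECK(cudaSetDevice(0));

    // One stream per thread: work within a stream executes in issue order.
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    float *d = NULL;
    CHECK(cudaMalloc(&d, n * sizeof(float)));

    // The three kernels run sequentially because they share a stream.
    stepA<<<grid, block, 0, stream>>>(d, n);
    stepB<<<grid, block, 0, stream>>>(d, n);
    stepC<<<grid, block, 0, stream>>>(d, n);
    CHECK(cudaGetLastError());

    // Copy back one element; it is (1 * 2) + 3 if the kernels ran in order.
    CHECK(cudaMemcpyAsync(&args->result, d, sizeof(float),
                          cudaMemcpyDeviceToHost, stream));
    CHECK(cudaStreamSynchronize(stream));

    CHECK(cudaFree(d));
    CHECK(cudaStreamDestroy(stream));
    return NULL;
}

int main() {
    const int kThreads = 4;
    pthread_t threads[kThreads];
    WorkerArgs args[kThreads];

    for (int i = 0; i < kThreads; ++i) {
        args[i].id = i;
        args[i].result = 0.0f;
        pthread_create(&threads[i], NULL, worker, &args[i]);
    }
    for (int i = 0; i < kThreads; ++i) {
        pthread_join(threads[i], NULL);
        printf("thread %d: first element = %f\n", args[i].id, args[i].result);
    }
    return 0;
}
```

Something like `nvcc -o workers workers.cu -lpthread` should build it (the file name is arbitrary). Because each thread has its own stream, ordering is only guaranteed within a thread; if work from different pthreads must run in a particular order relative to each other, you still need CPU-side synchronization (or events) as noted above.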