Processing buffers bigger than 65536 in Clyther/OpenCL
I am currently discovering OpenCL via the Python binding Clyther. So far I am experimenting with a very simple script that computes the sin or cos of a buffer of size 65536. Apparently 65536 is the limit for buffers on my card, but if I had 16 million numbers in my buffer, how would I go about it without constantly bringing the CPU into it to retrieve and send data?
What I do so far is fill the buffer, run the kernel, and retrieve the buffer, all in a loop, but that also hits the CPU badly.
I looked a bit at OpenCL docs but I just failed to understand how that is achieved.
Thank you
This looks an awful lot like you are using __constant memory. The solution is to use __global memory instead, but you have to be careful about how you access it for best performance.
__constant memory is a special address space for often-used constant values, but it is restricted in size on current GPUs.
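To make the difference concrete, here is a minimal sketch of the same sin computation with both kernel arguments declared __global. It is an assumption-heavy illustration: it uses PyOpenCL rather than Clyther (I have not checked how Clyther spells the address space), and the names KERNEL_SRC, sin_all and sin_on_device are made up for this example. The idea is that the whole array goes to the device once, one kernel launch covers every element, and the result comes back once, so there is no per-chunk loop on the CPU.

```python
# Kernel source: both buffers live in __global memory, so their size is
# bounded by device memory rather than by the small __constant space.
KERNEL_SRC = """
__kernel void sin_all(__global const float *src, __global float *dst)
{
    size_t i = get_global_id(0);   /* one work-item per element */
    dst[i] = sin(src[i]);
}
"""


def sin_on_device(values):
    """Compute sin() of every element with a single kernel launch (sketch)."""
    # Imported here so the module still loads without an OpenCL runtime.
    import numpy as np
    import pyopencl as cl  # assumption: PyOpenCL is installed, not Clyther

    host_in = np.ascontiguousarray(values, dtype=np.float32).ravel()
    host_out = np.empty_like(host_in)

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags

    # One upload of the whole array instead of a loop over small chunks.
    dev_in = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=host_in)
    dev_out = cl.Buffer(ctx, mf.WRITE_ONLY, host_out.nbytes)

    program = cl.Program(ctx, KERNEL_SRC).build()
    # Global size equals the element count; the runtime picks the local size.
    program.sin_all(queue, (host_in.size,), None, dev_in, dev_out)

    # One blocking download of the results.
    cl.enqueue_copy(queue, host_out, dev_out)
    return host_out
```

Calling something like sin_on_device(numpy.linspace(0.0, 1.0, 16000000)) should then need only two host/device transfers in total. Consecutive work-items read consecutive floats here, which is the kind of access pattern __global memory tends to like.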