OpenCL device info: amount of local memory
My question is about the OpenCL call clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE as the param_name argument.
Does it return the per work group amount of local memory, or is it the total amount of memory available as local on the device? Or anything else?
My GPU is an Nvidia GeForce 9800 GT and the returned value is 16K for the above call.
Thanks in advance!
It's per compute unit. The local memory is used by all workgroups executed on the compute unit. One single group can't exceed this size, since it must be executed on a single compute unit.
For example, in your case, if each workgroup requires 8K of local memory, at most two workgroups can be scheduled at the same time on each compute unit.
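To make the arithmetic above concrete, here is a minimal sketch in plain C. The function names are made up for illustration and are not part of the OpenCL API. It only considers local memory; in practice other limits (registers, the maximum number of resident groups, any local memory the implementation reserves for itself) can lower the real number further.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if one work-group's local memory request
   fits within the value reported for CL_DEVICE_LOCAL_MEM_SIZE, else 0. */
int group_fits(size_t device_local_mem, size_t group_local_mem)
{
    return group_local_mem <= device_local_mem;
}

/* Hypothetical helper: upper bound on the number of work-groups that can be
   resident on one compute unit, looking at local memory only. */
size_t max_groups_by_local_mem(size_t device_local_mem, size_t group_local_mem)
{
    if (group_local_mem == 0)
        return SIZE_MAX; /* no local memory used, so this limit does not apply */
    return device_local_mem / group_local_mem; /* integer division rounds down */
}
```

With the 16K (16384 bytes) reported for the 9800 GT, max_groups_by_local_mem(16384, 8192) gives 2, matching the example above, while a group asking for more than 16384 bytes fails the group_fits check.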
CL_DEVICE_LOCAL_MEM_SIZE is the maximum amount of local memory available per work group. In the context of your NVIDIA card, it is the amount of on-die shared memory per multiprocessor, in this case 16 KB, which can be consumed by one or more work groups running on that multiprocessor.