Local mem on OpenCL hardware
I've been wondering: is there a way to estimate the amount of shared memory on the different GPGPUs without going out and buying the cards?
I currently have a GTS 330M with 16K of shared mem in my laptop, and a GTX 480 with 16K + 32K = 48K of shared mem.
I would like to know if getting a Tesla card would give me more shared mem per block, or if it would be the same as the GTX card.
How does one figure this out? I'm not able to look it up in the specs on NVIDIA's site. Perhaps an AMD GPGPU would be better; how does one figure that out?
I hope someone can help.
For NVIDIA hardware, the shared memory configuration of every CUDA/OpenCL-capable card is described in Appendix F of the CUDA 4.0 programming guide.
To answer your question about a Fermi Tesla card: it has the same shared memory configuration as your GTX 480, namely 16 KB or 48 KB of shared memory, user-selectable at runtime.
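On any machine you do have access to, the number can also be read out programmatically: clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE reports what CUDA calls shared memory (OpenCL calls it local memory). A minimal sketch, with error checking omitted for brevity:

```c
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    /* Assumes at most 16 platforms/devices; plenty for a typical machine. */
    cl_platform_id platforms[16];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(16, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms && p < 16; ++p) {
        cl_device_id devices[16];
        cl_uint num_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 16, devices, &num_devices);

        for (cl_uint d = 0; d < num_devices && d < 16; ++d) {
            char name[256];
            cl_ulong local_mem = 0;        /* local (shared) memory in bytes */
            cl_device_local_mem_type type; /* CL_LOCAL = dedicated on-chip */

            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof name, name, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_LOCAL_MEM_SIZE,
                            sizeof local_mem, &local_mem, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_LOCAL_MEM_TYPE,
                            sizeof type, &type, NULL);

            printf("%s: %lu KB local mem (%s)\n", name,
                   (unsigned long)(local_mem / 1024),
                   type == CL_LOCAL ? "dedicated" : "emulated in global memory");
        }
    }
    return 0;
}
```

That obviously doesn't answer "what does a card I don't own have", but it lets you verify the documentation figures on any card you can borrow.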
Note that when data has to be transferred over the PCI-e bus, the global memory of another device is about as slow as using the CPU's memory. If your input data cannot be split up, so that memory capacity is a bigger bottleneck than speed, try using OpenCL on a vector-enabled CPU such as Intel Sandy Bridge or AMD Fusion.
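If you want to try that route, selecting a CPU device instead of a GPU only changes the device-type flag passed to clGetDeviceIDs. A minimal sketch, assuming an OpenCL platform with a CPU driver (e.g. Intel's or AMD's) is installed:

```c
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platforms[16];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(16, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms && p < 16; ++p) {
        cl_device_id cpu;
        /* Ask this platform specifically for a CPU device. */
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_CPU, 1, &cpu, NULL)
                != CL_SUCCESS)
            continue; /* no CPU driver in this platform */

        char name[256];
        clGetDeviceInfo(cpu, CL_DEVICE_NAME, sizeof name, name, NULL);
        printf("Using CPU device: %s\n", name);

        /* From here the usual flow applies: build a context and queue on
         * this device; its "local" memory is ordinary system RAM. */
        cl_context ctx = clCreateContext(NULL, 1, &cpu, NULL, NULL, NULL);
        clReleaseContext(ctx);
        return 0;
    }
    fprintf(stderr, "No OpenCL CPU device found\n");
    return 1;
}
```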
Have you tried running the JavaCL hardware report? http://nativelibs4java.sourceforge.net/webstart/OpenCL/HardwareReport.jnlp