I looked through the programming guide and the best practices guide, and they mention that global memory access takes 400-600 cycles. I did not see much on the other memory types like texture cache, consta
My programming experience is about 1 year of C/C++ experience from high school, but I did my research and wrote a simple program with OpenCL a few months ago. I was able to compile and run this on an
I'm looking for a good OpenCL wrapper\library for Python, with good documentation. I tried to search some... but couldn't find one good enough. The most popular and best documented op
It seems like 2 million floats should be no big deal: only 8 MB of the 1 GB of GPU RAM. I am able to allocate that much at times, and sometimes more than that, with no trouble. I get CL_OUT_OF_RESOURCES when
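A quick check of the arithmetic in the question above, as a minimal sketch. It assumes 4-byte floats (the size of an OpenCL `cl_float`) and reads "1GB" as 1 GiB; the helper name `buffer_bytes` is made up for illustration.

```python
# Size check for the buffer described in the question.
# Assumption: each float is 4 bytes, as with OpenCL's cl_float.
BYTES_PER_FLOAT = 4


def buffer_bytes(num_floats: int) -> int:
    """Return the raw size in bytes of a buffer holding num_floats floats."""
    return num_floats * BYTES_PER_FLOAT


num_floats = 2_000_000
device_bytes = 1 * 1024**3  # assumption: "1GB" of GPU RAM means 1 GiB

size = buffer_bytes(num_floats)
print(f"buffer size: {size} bytes")
print(f"in MiB: {size / 1024**2:.2f}")
print(f"share of device memory: {100 * size / device_bytes:.2f}%")
```

Under those assumptions the buffer is 8,000,000 bytes, well under 1% of device memory, which is consistent with the poster's point that the raw size alone looks unremarkable.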
I just wanted to ask, if somebody can give me a heads up on what to pay attention to when using several simple kernels after each other.
Is it OK to use both OpenGL and OpenCL in one program? Both operate on the GPU, and I'm afraid of how switching between OpenCL and OpenGL is handled in the "background" (e.g. registers are over
I used the CL_MEM_ALLOC_HOST_PTR flag with my clCreateBuffer calls, but the Compute Profiler shows all my "host mem transfer type" as being Pageable. I tried it in two different kernel setups, but t
If my algorithm is bottlenecked by host to device and device to host memory transfers, is the only solution a different or revised algorithm? There are a couple things you can try t
Is it possible to compare more than two kernel executions at a time in Compute Prof? Yes, if they are run by the same process. They will show up in the final log file.
Searching through the NVIDIA forums I found these questions, which are also of interest to me, but nobody had answered them in the last four days or so. Can you help?