gpgpu

In OpenCL, what does mem_fence() do, as opposed to barrier()?
Unlike barrier() (which I think I understand), mem_fence() does not affect all items in the work group. The OpenCL spec says (section 6.11.10), for mem_fence():
Reads: 8
How To Structure Large OpenCL Kernels?
I have worked with OpenCL on a couple of projects, but have always written the kernel as one (sometimes rather large) function. Now I am working on a more complex project and would like
Reads: 7
Using int index where double is expected in C++ AMP restrict(direct3d) code
Googling didn’t help much. Has anyone used AMP? In the code snippet below, the cast from integer to double (double v = idx.x) leads to a “Failed to create shader” runtime error.
Reads: 6
How good is NVCC at code optimizations?
How well does NVCC optimize device code? Does it do any sort of optimizations like constant folding and common subexpression elimination?
Reads: 2
Can you predict the runtime of a CUDA kernel?
To what degree can one predict or calculate the performance of a CUDA kernel? Having worked a bit with CUDA, this seems nontrivial.
Reads: 4
CUDA multiple memory access
Please explain how memory access works in the following kernel: __global__ void kernel(float4 *a)
Reads: 4
How to quickly find an image in another image using CUDA?
In my current project I need to find the pixel-exact position of an image contained in another, larger image. The smaller image is never rotated or stretched (so it should match pixel by pixel) but it may ha
Reads: 5
CUDA n-body simulation - shared memory problem
Based on the example from the NVIDIA GPU Computing SDK, I created two kernels for the n-body simulation. The first kernel, which doesn't take advantage of shared memory, is ~15% faster than the second kernel
Reads: 3
OpenCL vs OpenMP performance [closed]
Closed. This question needs to be more focused. It is not currently accepting answers. Want to improve this question? Update the question so it focuses on one problem only by editing this po
Reads: 3
CUDA kernel function taking longer than equivalent host function
I'm following along with http://code.google.com/p/stanford-cs193g-sp2010/ and the video lectures posted online. Doing one of the problem sets posted (the first one), I've encountered something slight
Reads: 8
