I've been working on setting up a 'smart paging' library for my OpenCL projects. Basically this involves checking:
My algorithm consists of two steps: Data generation. In this step I generate a data array in a loop as the result of some function
I'm looking for some 1D problems in CUDA and HPC, e.g. Black-Scholes. By 1D problems, I mean problems in which all the work is done on 1D arrays. Although matrix multiplication can be expressed in
Today I added four more __local variables to my kernel to dump intermediate results in. But just adding the four more variables to the kernel's signature and adding the corresponding Kernel arguments
A warp is 32 threads. Do the 32 threads execute in parallel in a multiprocessor?
I know that devices before the Fermi architecture had 8 SPs in a single multiprocessor. Is the count the same in the Fermi architecture? The answer depends on the Compute Capability property o
Specs: Radeon 3870HD w/ OpenGL 3.3 & GLSL 1.5. I am rendering data through a computational shader. Because of dependencies I had to put all my data into uniform textures and nothing lef
Bicubic interpolation is one of the common interpolation methods, but I cannot find any working implementation in OpenCL. I decided to write bicubic interpolation in OpenCL myself, but ...
In Wikipedia and other sources' description of OpenGL 4.0 I read about this feature: Drawing of data generated by OpenGL or external APIs such as OpenCL, without CPU intervention.
I have implemented a Matrix datatype in C++ by using a 1D datatype and wrapping it into rows and columns. Now, I want to be able to create square/blocked sub-matrices from t