CUDA Array/Surface Memory

After launching the kernel test, I print the dtr1 array. I expect every element to be 100, but that is not what I get. Why is that?

#include "ImageUtil2D.h"
#define W 10
#define H 10
#define MAX 100000
#define No_THREADS 10
surface<void,2> surfD;

__global__ void test()
{
for(int i=0;i<W;i++)
    for(int j=0;j<H;j++)
    {
        float a=100;
        surf2Dwrite(a, surfD, i,j, cudaBoundaryModeTrap);
    }
}

int main()
{
int *image = new int[W*H];
float *dtr = new float[W*H];
ImageUtil2D::InitImg(image, dtr, W, H);
const size_t sizef = size_t(W*H)*sizeof(float);

cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
cudaArray* cuArrD;
cudaMallocArray(&cuArrD, &channelDesc, W*H, 0, cudaArraySurfaceLoadStore);
//cudaMemcpyToArray(cuArrD, 0, 0, dtr, sizef, cudaMemcpyHostToDevice);
cudaBindSurfaceToArray(surfD, cuArrD);

test<<<1, 1>>>();

float *dtr1=new float[W*H];
cudaMemcpyFromArray(&dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );
ImageUtil2D::Print(dtr1);
return 0;
}


CUDA C Programming Guide 3.2. Section: 3.2.4.2.2 Surface Binding

Unlike texture memory, surface memory uses byte addressing. This means that the x-coordinate used to access a texture element via texture functions needs to be multiplied by the byte size of the element to access the same element via a surface function.

Try this:

surf2Dwrite(a, surfD, i * 4, j, cudaBoundaryModeTrap);
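To see why the scaling matters, here is a minimal host-side sketch that models one surface row as raw bytes. It is plain C++, not the CUDA API: writeAtByteX and fillRow are hypothetical helpers made up for illustration, and the model does not reproduce what real hardware does with a misaligned x-coordinate. It only shows that stepping x by 1 byte leaves most of the row untouched, while stepping by sizeof(float) reaches every element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the x argument of surf2Dwrite: byteX is a byte
// offset into the row, not an element index.
void writeAtByteX(std::vector<unsigned char>& row, std::size_t byteX, float value) {
    // Rough analogue of a bounds trap: refuse writes past the end of the row.
    assert(byteX + sizeof(float) <= row.size());
    std::memcpy(row.data() + byteX, &value, sizeof(float));
}

// Write `value` `width` times, advancing x by `strideBytes` each step, then
// return the row reinterpreted as floats (what the host sees after a copy).
std::vector<float> fillRow(std::size_t width, float value, std::size_t strideBytes) {
    std::vector<unsigned char> row(width * sizeof(float), 0);
    for (std::size_t i = 0; i < width; ++i) {
        writeAtByteX(row, i * strideBytes, value);
    }
    std::vector<float> out(width);
    std::memcpy(out.data(), row.data(), row.size());
    return out;
}
```

In this model, fillRow(10, 100.0f, sizeof(float)) fills all ten elements, whereas fillRow(10, 100.0f, 1) only touches the first 13 bytes, so the later elements keep their initial value.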

Hope this helps.

Suggestion: Read the whole chapter on Surface Memory, or you will run into read/write coherency problems sooner than you expected ;)


The additional issue pointed out by pQB in the comment to his own answer on

cudaMemcpyFromArray(&dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );

can be fixed by changing the above line to

cudaMemcpyFromArray(dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );
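Both versions compile because the destination parameter is a void*, and both float* and float** convert to it implicitly. A small host-only sketch of the difference, where copyOut is a hypothetical stand-in for a copy routine with a void* destination and copyIntoBuffer is made up for illustration:

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a copy call whose destination is a void*.
void copyOut(void* dst, const float* src, std::size_t bytes) {
    std::memcpy(dst, src, bytes);
}

std::vector<float> copyIntoBuffer(const std::vector<float>& src) {
    float* dtr1 = new float[src.size()];

    // &dtr1 is the address of the pointer variable itself (a float**).
    // Passing it would overwrite the pointer and whatever follows it in
    // memory instead of the heap buffer, so it is not executed here.
    static_assert(std::is_same<decltype(&dtr1), float**>::value,
                  "&dtr1 has type float**, not float*");

    // Passing dtr1 hands over the address of the allocated buffer.
    copyOut(dtr1, src.data(), src.size() * sizeof(float));

    std::vector<float> out(dtr1, dtr1 + src.size());
    delete[] dtr1;
    return out;
}
```

Since the compiler accepts either form without a warning, this kind of mistake is only visible at run time.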