CUDA 4.0 using pointers within kernels - error
My question is as follows:
I wish to use a kernel in two ways.
- I use an array d_array that has been copied over using cudaMemcpy, i.e. through
cutilSafeCall(cudaMemcpy(d_array, array, 100*sizeof(double), cudaMemcpyHostToDevice));
Or
- I input a double mydouble directly, i.e.
double mydouble = 3;
If I input the array I simply use (which works fine):
kernel<<<1, 100>>>(d_array, 100, output);
If I input a double I use (which doesn't work fine!!!!):
kernel<<<1, 100>>>(&mydouble, 1, output);
My kernel is listed below:
__global__ void kernel(double * d_array, int size_d_array, double * output)
{
    double a;
    if (size_d_array == 100)
    {output[threadIdx.x] = d_array[threadIdx.x];}
    else
    {output[threadIdx.x] = d_array[0];}
}
double aDouble = 3;
double *myDouble = &aDouble;
If you do the above in host code, then myDouble is a pointer to host memory. That is why you can't pass it directly to a device kernel (a pointer is a pointer, whether it points to an array or a scalar value!).
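The most portable fix is probably to give the scalar its own device allocation and copy it over, exactly as is already done for the array. Below is a minimal sketch of that idea, assuming the kernel from the question. The helper run_scalar_case, the CHECK macro and the names d_scalar and d_output are made up for illustration and are not from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Report a CUDA error and bail out of the enclosing function.
// Sketch only: allocations are not freed on the error path.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); \
            return false;                                             \
        }                                                             \
    } while (0)

// Same logic as the kernel in the question: copy element-wise for the
// 100-element array, otherwise broadcast d_array[0] to every thread.
__global__ void kernel(const double *d_array, int size_d_array, double *output)
{
    if (size_d_array == 100)
        output[threadIdx.x] = d_array[threadIdx.x];
    else
        output[threadIdx.x] = d_array[0];
}

// Hypothetical helper: copies one host double to the device, launches the
// kernel with n threads (n must not exceed the block size limit), and
// copies the n results back into h_output.
bool run_scalar_case(double value, double *h_output, int n)
{
    double *d_scalar = NULL;
    double *d_output = NULL;

    CHECK(cudaMalloc((void **)&d_scalar, sizeof(double)));
    CHECK(cudaMalloc((void **)&d_output, n * sizeof(double)));

    // The scalar lives in host memory, so it has to be copied over first.
    CHECK(cudaMemcpy(d_scalar, &value, sizeof(double), cudaMemcpyHostToDevice));

    // Pass the DEVICE pointer, not &value.
    kernel<<<1, n>>>(d_scalar, 1, d_output);
    CHECK(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    CHECK(cudaMemcpy(h_output, d_output, n * sizeof(double), cudaMemcpyDeviceToHost));

    CHECK(cudaFree(d_scalar));
    CHECK(cudaFree(d_output));
    return true;
}

int main()
{
    double mydouble = 3;
    double h_output[100];

    if (!run_scalar_case(mydouble, h_output, 100))
        return 1;

    printf("output[0] = %f, output[99] = %f\n", h_output[0], h_output[99]);
    return 0;
}
```

If all you need is a single read-only value, it may be simpler to change the kernel signature to take the double by value, since kernel arguments are copied to the device at launch. That changes the interface from the question, so it only helps if the array and scalar cases can use separate kernels or an extra parameter.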
However, in CUDA 4.0 you can call cudaHostRegister on the host pointer, and if your system supports unified virtual addressing you can then pass it to the kernel. If it does not, you can call cudaHostRegister with the appropriate flags (cudaHostRegisterMapped) and then cudaHostGetDevicePointer to get a pointer you can pass to the device kernel. See the CUDA documentation on