What is the difference between float2 and cuComplex, and which should I use?
I am trying to figure out how to use complex numbers in both my host and device code. I came across cuComplex (but can't find any documentation!) and float2, which at least gets a mention in the CUDA programming guide.
What should I use? In the header file for cuComplex, it looks like the functions are declared with __host__ __device__, so I am assuming that means it would be OK to use them in either place.
My original data is being read in from a file into a std::complex<float>, so I don't really want to mess with that. I guess in order to use the complex values on the GPU, though, I will have to copy from the original complex<float> to the cuComplex?
cuComplex is defined in /usr/local/cuda/include/cuComplex.h (modulo your install dir). The relevant snippets:
typedef float2 cuFloatComplex;
typedef cuFloatComplex cuComplex;
typedef double2 cuDoubleComplex;
There are also handy functions in there for working with complex numbers -- multiplying them, building them, etc.
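As a rough illustration of those helpers, here is a minimal sketch that uses make_cuFloatComplex, cuCmulf and cuCaddf from cuComplex.h. The names complex_axpy and caxpy_kernel are made up for this example and are not part of the header. Since the header's functions are declared __host__ __device__, the same helper should be callable from both the CPU and a kernel.

```cuda
#include <cuComplex.h>

// Hypothetical helper (not part of cuComplex.h): computes a*x + y for
// single-precision complex values. Marked __host__ __device__ so the same
// code can be called on the CPU and from inside a kernel.
__host__ __device__ inline cuFloatComplex complex_axpy(cuFloatComplex a,
                                                       cuFloatComplex x,
                                                       cuFloatComplex y)
{
    // cuCmulf and cuCaddf are the complex multiply and add from cuComplex.h.
    return cuCaddf(cuCmulf(a, x), y);
}

// Hypothetical kernel: y[i] = a * x[i] + y[i] for n complex elements.
__global__ void caxpy_kernel(int n, cuFloatComplex a,
                             const cuFloatComplex* x, cuFloatComplex* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = complex_axpy(a, x[i], y[i]);
    }
}
```

On the host, values can be built with make_cuFloatComplex(re, im) and read back with cuCrealf and cuCimagf; the double-precision counterparts (make_cuDoubleComplex, cuCmul, cuCadd, cuCreal, cuCimag) follow the same pattern.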
As for whether to use float2 or cuComplex, you should use whichever is semantically appropriate -- is it a vector or a complex number? Also, if it is a complex number, you may want to consider using cuFloatComplex or cuDoubleComplex just to be fully explicit.
If you're trying to work with cuBLAS or cuFFT, you should use cuComplex. If you are going to write your own functions, there should be no difference in performance, as both are just a structure of two floats.
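Because both types are just two floats, the std::complex<float> data from the question should not need an element-by-element conversion. The sketch below assumes the usual layout (real part first, then imaginary, which C++11 guarantees for std::complex<float>) and copies the bytes straight into a cuFloatComplex device buffer. The names scale_kernel and scale_on_gpu are hypothetical and exist only for this example.

```cuda
#include <complex>
#include <vector>
#include <cuComplex.h>
#include <cuda_runtime.h>

// std::complex<float> and cuFloatComplex (float2) both store {real, imag}
// as two consecutive floats, so a byte-wise copy preserves the values.
static_assert(sizeof(std::complex<float>) == sizeof(cuFloatComplex),
              "sizes must match for a raw copy");

// Hypothetical kernel: multiply every element by the complex factor s.
__global__ void scale_kernel(int n, cuFloatComplex s, cuFloatComplex* data)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = cuCmulf(s, data[i]);
    }
}

// Hypothetical helper: scales the host data in place using the GPU.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t scale_on_gpu(std::vector<std::complex<float>>& host,
                         std::complex<float> s)
{
    const int n = static_cast<int>(host.size());
    if (n == 0) return cudaSuccess;
    const size_t bytes = host.size() * sizeof(cuFloatComplex);

    cuFloatComplex* dev = nullptr;
    cudaError_t err = cudaMalloc(&dev, bytes);
    if (err != cudaSuccess) return err;

    // No conversion loop: copy the std::complex<float> bytes directly.
    err = cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        scale_kernel<<<blocks, threads>>>(
            n, make_cuFloatComplex(s.real(), s.imag()), dev);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        // This blocking copy runs after the kernel on the default stream.
        err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return err;
}
```

Going through cudaMemcpy rather than casting the host pointer to cuFloatComplex* sidesteps the alignment difference: float2 is declared with 8-byte alignment, while std::complex<float> only needs 4.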
IIRC, float2 is a struct of two floats (members x and y) rather than an array. cuComplex (from the name alone) sounds like CUDA's complex format.
This post seems to point to where to find more on cuComplex: http://forums.nvidia.com/index.php?showtopic=81514