
Generating random numbers from a gaussian distribution in CUDA

I've searched the internet for a way to generate random numbers on my CUDA device, inside a kernel. The numbers must come from a Gaussian distribution.

The best thing I found was from NVIDIA itself. It is the Wallace algorithm, that uses a uniform distribution to build a Gaussian one. But the code samples they give lack explanation, and I really need to understand how the algorithm works, especially on the device. For example, they give:

    __device__ void generateRandomNumbers_wallace(
        unsigned seed,            // Initialization seed
        float *chi2Corrections,   // Set of correction values
        float *globalPool,        // Input random number pool
        float *output             // Output random numbers
    ){
        unsigned tid = threadIdx.x;
        // Load global pool into shared memory.
        unsigned offset = __mul24(POOL_SIZE, blockIdx.x);
        for( int i = 0; i < 4; i++ )
            pool[tid+THREADS*i] = globalPool[offset+TOTAL_THREADS*i+tid];
        __syncthreads();
        const unsigned lcg_a = 241;
        const unsigned lcg_c = 59;
        const unsigned lcg_m = 256;
        const unsigned mod_mask = lcg_m-1;
        seed = (seed+tid) & mod_mask;
        // Loop generating outputs repeatedly
        for( int loop = 0; loop < OUTPUTS_PER_RUN; loop++ )
        {
            Transform();
            unsigned intermediate_address;
            i_a = __mul24(loop,8*TOTAL_THREADS)+8*THREADS *
                blockIdx.x + threadIdx.x;
            float chi2CorrAndScale = chi2Corrections[
                blockIdx.x * OUTPUTS_PER_RUN + loop];
            for( i = 0; i < 4; i++ )
                output[i_a + i*THREADS] = chi2CorrAndScale*pool[tid+THREADS*i];
        }
    }

First of all, many of the variables declared aren't even used in the function! And I really don't get what the "8" is for in the second loop. I understand that the "4" in the other loops has something to do with the 4x4 orthogonal matrix block, am I right? Could anyone give me a better idea of what is going on here?

Anyway, does anyone have any good code samples I could use? Or does anyone have another way of generating random gaussian numbers in a CUDA kernel? Code samples will be much appreciated.

Thanks!


You could use CURAND, which is included with the CUDA Toolkit (version 3.2 and later). It'd be far simpler!
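As a rough illustration of the CURAND route, here is a minimal sketch using the device API from `curand_kernel.h`. The kernel name `gaussianKernel`, the host helper `generateGaussians`, and the choice of one generator state per thread are my own assumptions for the example, not something from the original post. Every thread uses the same seed with its own subsequence number, then draws standard normal samples with `curand_normal`.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Each thread owns one generator state and writes `perThread` N(0,1) samples.
__global__ void gaussianKernel(float *out, int perThread, unsigned long long seed)
{
    int gid = blockIdx.x * blockDim.x + threadIdx.x;

    curandState state;
    // Same seed for all threads, a distinct subsequence per thread, offset 0.
    curand_init(seed, gid, 0, &state);

    for (int i = 0; i < perThread; ++i)
        out[gid * perThread + i] = curand_normal(&state);
}

// Hypothetical host helper: launches the kernel and copies the samples back.
// Returns an empty vector if any CUDA call fails.
std::vector<float> generateGaussians(int blocks, int threads, int perThread,
                                     unsigned long long seed)
{
    size_t n = (size_t)blocks * threads * perThread;
    std::vector<float> host(n);

    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess)
        return {};

    gaussianKernel<<<blocks, threads>>>(dev, perThread, seed);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host.data(), dev, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess)
        return {};
    return host;
}
```

Something like `generateGaussians(4, 64, 16, 1234ULL)` would give 4096 samples. Link with `-lcurand` is not needed for the device API alone, but `curand_init` is fairly expensive, so in real code you would probably keep the states in global memory and reuse them across kernel launches.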

A few notes on the code you posted:

  • The Wallace generator transforms Gaussian to Gaussian (i.e. not Uniform to Gaussian)
  • CUDA kernels have two built-in variables, blockIdx and threadIdx, which give the block index and the thread index within that block; see the CUDA Programming Guide for more information
  • The code uses __mul24; on sm_20 and later this is actually slower than "ordinary" 32-bit multiplication, so I would avoid it (even on older architectures, for simplicity)
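To make the first note a bit more concrete: a Gaussian-to-Gaussian step works because multiplying a vector of independent standard normal values by an orthogonal matrix gives values that are again standard normal, and the sum of squares is preserved. Below is a sketch of one 4x4 orthogonal transform of this kind (a Hadamard-type matrix scaled by 1/2). The function name `orthogonal4` is mine, and this is not necessarily the matrix that NVIDIA's `Transform()` applies.

```cuda
#include <cuda_runtime.h>

// Multiply (a, b, c, d) by H/2, where H is a 4x4 matrix with +/-1 entries
// and mutually orthogonal rows. H/2 is orthogonal, so the sum of squares
// a*a + b*b + c*c + d*d is the same before and after the call.
__host__ __device__ inline void orthogonal4(float &a, float &b, float &c, float &d)
{
    float t = 0.5f * (a + b + c + d);
    a = a - t;   // ( a - b - c - d) / 2
    b = b - t;   // (-a + b - c - d) / 2
    c = t - c;   // ( a + b - c + d) / 2
    d = t - d;   // ( a + b + c - d) / 2
}
```

For example, (1, 2, 3, 4) becomes (-4, -3, 2, 1), and both vectors have a sum of squares of 30. Since the sum of squares never changes, a generator built only from such steps needs a separate rescaling of its output, which is presumably the role of the chi2Corrections values in the posted code.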


The Box-Muller method is also good. It turns two independent uniform random numbers into two independent Gaussian ones.
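A minimal sketch of the basic (trigonometric) Box-Muller transform, written so it can be called from a kernel or from host code. The function name `boxMuller` is just for illustration, and where the two uniforms come from is up to you. CURAND's `curand_uniform` returns values in (0, 1], which suits `u1` because it keeps the logarithm finite.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// Box-Muller: u1 must lie in (0, 1], u2 in [0, 1).
// Returns two independent standard normal samples in .x and .y.
__host__ __device__ inline float2 boxMuller(float u1, float u2)
{
    const float twoPi = 6.2831853f;
    float r = sqrtf(-2.0f * logf(u1));   // radius from the first uniform
    float theta = twoPi * u2;            // angle from the second uniform
    return make_float2(r * cosf(theta), r * sinf(theta));
}
```

To get a Gaussian with mean `mu` and standard deviation `sigma`, scale each result as `mu + sigma * z`.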


The Fast Walsh Hadamard transform is done by patterns of addition and subtraction. Hence the central limit theorem applies: every output is a signed sum of all the inputs, so an array of uniform random numbers that undergoes a Walsh Hadamard transformation will have an (approximately) Gaussian/Normal distribution. There are some slight technical details about that.

The algorithm was not discovered by Wallace. It was first published in Servo Magazine around 1993/1994 by myself. I have code about the Walsh Hadamard transform at www.code.google.com/p/lemontree

Regards, Sean O'Connor
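For reference, a minimal sketch of an in-place fast Walsh-Hadamard transform, callable from host or device code. This is the textbook butterfly form, not the code from the site mentioned above, and the name `fwht` and the 1/sqrt(n) scaling are my own choices. It assumes `n` is a power of two. If the inputs are uniforms, centering them first (for example on [-0.5, 0.5)) is probably one of the technical details to take care of, since otherwise the first output is just the scaled sum of all inputs and carries the mean.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// In-place fast Walsh-Hadamard transform; n must be a power of two.
// The result is scaled by 1/sqrt(n) so that the transform is orthonormal
// (it preserves the sum of squares of the array).
__host__ __device__ inline void fwht(float *x, int n)
{
    // Butterfly passes: combine elements h apart with add/subtract.
    for (int h = 1; h < n; h <<= 1) {
        for (int i = 0; i < n; i += 2 * h) {
            for (int j = i; j < i + h; ++j) {
                float a = x[j];
                float b = x[j + h];
                x[j]     = a + b;
                x[j + h] = a - b;
            }
        }
    }
    float scale = 1.0f / sqrtf((float)n);
    for (int i = 0; i < n; ++i)
        x[i] *= scale;
}
```

As a small check, (1, 2, 3, 4) transforms to (5, -1, -2, 0). This version runs serially within one thread; a real kernel would spread the butterflies of each pass across the threads of a block.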
