How good is the RAND() function in Excel for Monte Carlo simulation?
I'm implementing a Monte Carlo simulation in 3 variables in Excel. I've used the RAND() function to sample from Weibull distributions (with long tails). The functions applied to the samples are non-linear but smooth (exp, ln, cos, etc). The result for each sample is a pass/fail, and the overall result is a probability of failure.
I have also implemented this by both numerical integration and Monte Carlo in MathCad, getting the same result both times. MathCad uses (I think) a Mersenne Twister random number generator.
My Excel spreadsheet is getting consistently different results (i.e. always larger). I have checked that the equations are the same.
What random number generator does Excel use, and how good is it? Is it possible that this is the source of my problem? I have assumed the Excel implementations of exp, cos etc are OK.
Finally, is there a way to implement Monte Carlo to mitigate against the (known) poor properties of a particular random number generator? (I've heard of Markov chains, random walks etc, but don't really know much about them)
Many thanks.
Since this is the top result in Google for "how good is Excel's RAND() function", it is worth updating the answers for later versions of Excel.
The paper by Guy Mélard, "On the accuracy of statistical procedures in Microsoft Excel 2010", tested the RAND() function in Excel 2010 and found it to be substantially improved over 2007 or 2003. Microsoft switched from an incorrectly implemented Wichmann and Hill generator (2007/2003) to the Mersenne Twister algorithm, which has a far longer period.
The author of that paper ran it through the "Small Crush", "Crush" and "Big Crush" test batteries for randomness and it passed nearly all of the tests.
So while it certainly isn't the same as true random numbers, the RAND() function in Excel 2010, and presumably newer versions, can no longer be considered terrible.
It should be noted, however, that Excel 2010 still uses two completely different algorithms for the VBA random number generator and for the RNG in the data analysis tool-kit. According to Mélard, both of those are still terrible, and in fact VBA uses the same seed number each time, so it produces the same numbers.
My biggest complaints with the random numbers in Excel are:
- You can't set the seed, so the numbers are not reproducible
- The random numbers update every time you press enter/delete, and even if you set calculation options to Manual, they still update when you save the Excel file
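For comparison, here is a minimal sketch of what a reproducible run looks like outside Excel. It uses Python's standard `random` module, which is a Mersenne Twister and accepts an explicit seed. The Weibull parameters and the pass/fail function are made up purely for illustration; they are not the ones from the question.

```python
import math
import random


def estimate_failure_probability(n_samples, seed):
    """Estimate P(failure) for a toy pass/fail model with three Weibull inputs."""
    rng = random.Random(seed)  # Mersenne Twister with an explicit, fixed seed
    failures = 0
    for _ in range(n_samples):
        # Hypothetical (scale, shape) parameters, chosen only for illustration.
        a = rng.weibullvariate(1.0, 1.5)
        b = rng.weibullvariate(2.0, 0.8)
        c = rng.weibullvariate(0.5, 2.0)
        # Hypothetical smooth, non-linear pass/fail criterion.
        margin = math.exp(-a) + math.log1p(b) * math.cos(c) - 0.3
        if margin < 0.0:
            failures += 1
    p = failures / n_samples
    # Binomial standard error of the estimated probability.
    std_err = math.sqrt(p * (1.0 - p) / n_samples)
    return p, std_err


if __name__ == "__main__":
    print(estimate_failure_probability(100_000, seed=42))
    print(estimate_failure_probability(100_000, seed=42))  # identical to the line above
```

Because the seed is fixed, two runs give exactly the same estimate, which makes it much easier to compare against another implementation. The standard error gives a rough idea of how large a difference between two runs is still just sampling noise.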
There is a journal paper on this topic by McCullough and Heiser (2008): "On the accuracy of statistical procedures in Microsoft Excel 2007" (Computational Statistics and Data Analysis).
Quoting the original article:
The random number generator has always been inadequate. With Excel 2003, Microsoft attempted to implement the Wichmann–Hill generator and failed to implement it correctly. The fixed version appears in Excel 2007 but this fix was done incorrectly. Microsoft has twice failed to implement correctly the dozen lines of code that constitute the Wichmann–Hill generator; this is something that any undergraduate computer science major should be able to do. The Excel random number generator does not fulfill the basic requirements for a random number generator to be used for scientific purposes:
- it is not known to pass standard randomness tests, e.g., L’Ecuyer and Simard’s (2007) CRUSH tests (these supersede Marsaglia’s (1996) DIEHARD tests—see Altman et al. (2004) for a comparison);
- it is not known to produce numbers that are approximately independent in a moderate number of dimensions;
- it has an unknown period length; and
- it is not reproducible.
For further discussion of these points, see the accompanying article by McCullough (2008); the performance of Excel 2007 in this area is inadequate.
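To give an idea of how small the Wichmann–Hill generator mentioned in the quote is, here is a sketch of the algorithm as originally published (AS 183, 1982), written in Python. This is the textbook version, not Excel's code; the class name and interface are my own.

```python
class WichmannHill:
    """Wichmann-Hill (1982) combined generator, algorithm AS 183."""

    def __init__(self, s1, s2, s3):
        # The original paper asks for three integer seeds between 1 and 30000.
        self.s1, self.s2, self.s3 = s1, s2, s3

    def random(self):
        # Three small multiplicative congruential generators...
        self.s1 = (171 * self.s1) % 30269
        self.s2 = (172 * self.s2) % 30307
        self.s3 = (170 * self.s3) % 30323
        # ...combined by summing the fractions and keeping the fractional part.
        return (self.s1 / 30269.0 + self.s2 / 30307.0 + self.s3 / 30323.0) % 1.0


if __name__ == "__main__":
    gen = WichmannHill(1, 2, 3)
    print([gen.random() for _ in range(5)])
```

The same three seeds always give the same sequence, which is the reproducibility the quote says Excel lacks.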
Paul Wilmott, in his Quantitative Finance book, simply adds up the results of 12 calls to RAND() and subtracts 6 for a good approximation to a standard Normal variable. Quick and dirty.
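The trick works because a uniform variable on [0, 1) has mean 1/2 and variance 1/12, so the sum of 12 of them has mean 6 and variance 1. A rough sketch in Python (using `random.random()` as a stand-in for Excel's RAND(); the function name is mine):

```python
import random
import statistics


def approx_standard_normal(rng):
    """Sum of 12 uniforms minus 6: mean 0, variance 1, roughly bell-shaped."""
    return sum(rng.random() for _ in range(12)) - 6.0


rng = random.Random(12345)  # fixed seed so the check is repeatable
samples = [approx_standard_normal(rng) for _ in range(50_000)]

# Sample mean should be close to 0 and sample standard deviation close to 1.
print(statistics.mean(samples), statistics.pstdev(samples))
# The result can never fall outside [-6, 6], so the far tails are cut off.
print(min(samples), max(samples))
```

Note the hard limits at -6 and +6: if the quantity you care about is a small tail probability, this approximation is probably not a good choice.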
There are commercial products for this. Google turned up two before I got bored of looking:
http://www.mathwave.com/articles/random-numbers-excel-worksheets.html
http://www.ozgrid.com/Services/excel-random-number-generator.htm
RAND() is quite random, but for Monte Carlo simulations it may be a little too random (unless you're doing primality testing). Most Monte Carlo simulations just require pseudo-random, deterministic sequences. RANDBETWEEN(), which is part of the Excel Analysis ToolPak, may be all you need for pseudo-random sequences.