Best way to generate a set of integers of size N, distributed like a normal distribution, given a mean and std. deviation
I'm looking for a way to generate a set of integers with a specified mean and std. deviation.
Using the random library, it is possible to generate a set of random doubles with a Gaussian distribution. It would look something like this:
#include <tr1/random>
std::tr1::normal_distribution<double> normal(mean, stdDev);
std::tr1::ranlux64_base_01 eng;
eng.seed(1000);
for (int i = 0; i < N; i++)
{
    gaussiannums[i] = normal(eng);
}
However, for my application, I need integers instead of doubles. So my question is, how would you generate the equivalent of the above but for integers instead of doubles? One possible path to take is to convert the doubles into integers in some fashion, but I don't know enough about how the random library works to know whether this can be done in a fashion that really preserves the bell shape and the mean/std. deviation.
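For reference, the "convert the doubles into integers" path mentioned above could be sketched as follows. This is only an illustration under some assumptions: it uses the C++11 `<random>` header (which standardized the TR1 facilities) instead of `<tr1/random>`, the function name `rounded_normal` is made up, and `std::mt19937` is an arbitrary engine choice. Rounding to the nearest integer keeps the mean essentially unchanged; as a rule of thumb it adds roughly 1/12 to the variance when the standard deviation is not small compared with 1.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw n normal doubles and round each to the nearest integer.
std::vector<long> rounded_normal(std::size_t n, double mean, double std_dev, unsigned seed)
{
    std::mt19937 eng(seed);                                  // engine choice is an assumption
    std::normal_distribution<double> normal(mean, std_dev);
    std::vector<long> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(std::lround(normal(eng)));             // round half away from zero
    return out;
}
```

Calling `rounded_normal(N, mean, stdDev, 1000)` would fill a vector of N integers; the result is still unbounded, so it does not address the minimum/maximum requirement.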
I should mention that the goal here is not so much randomness, as it is to get a set of integers of a specific size, with the correct mean and std. deviation.
Ideally I would also like to specify the minimum and maximum values that can be produced, but I have not found any way to do this even for doubles, so any suggestions on this are also welcome.
This isn't possible.
The Gaussian distribution is continuous, whereas the set of integers is discrete.
The Gaussian pdf also has unbounded support, so if you specify a minimum and maximum you get a different distribution.
What are you really trying to do? Is it only the mean and standard deviation that count? Other distributions have a well-defined mean and standard deviation, including several discrete distributions.
For example, you could use a binomial distribution.
Solve the equations for the mean (np) and the variance (np(1-p)) simultaneously to get p and n. Then generate samples from this distribution.
If n doesn't come out integer, you can use a multinomial distribution instead.
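Solving the two binomial equations gives p = 1 - variance/mean and n = mean/p, which only works when 0 < variance < mean. A minimal sketch of that calculation follows, assuming C++11 `<random>`; the names `BinomialParams`, `solve_binomial` and `sample_binomial` are invented for illustration. When n is not an integer this sketch simply rounds it up and recomputes p from the mean, so the mean is matched and the variance comes out slightly larger than requested; that is a shortcut of the sketch, not the multinomial route mentioned above.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <vector>

struct BinomialParams { long n; double p; };

// Solve mean = n*p and variance = n*p*(1-p) for n and p.
BinomialParams solve_binomial(double mean, double std_dev)
{
    const double variance = std_dev * std_dev;
    if (!(variance > 0.0 && variance < mean))
        throw std::invalid_argument("binomial needs 0 < variance < mean");
    const double p = 1.0 - variance / mean;
    const double n_exact = mean / p;
    // Assumption of this sketch: round n up, then re-derive p so that n*p == mean.
    // Rounding up guarantees n > mean, hence p < 1.
    const long n = static_cast<long>(std::ceil(n_exact));
    return { n, mean / n };
}

// Draw `count` integers in [0, n] from the fitted binomial.
std::vector<long> sample_binomial(std::size_t count, double mean, double std_dev, unsigned seed)
{
    const BinomialParams bp = solve_binomial(mean, std_dev);
    std::mt19937 eng(seed);
    std::binomial_distribution<long> dist(bp.n, bp.p);
    std::vector<long> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        out.push_back(dist(eng));
    return out;
}
```

For example, a mean of 50 with a standard deviation of 5 gives p = 0.5 and n = 100. A target such as mean 10 with standard deviation 5 is rejected, because a binomial cannot have a variance larger than its mean.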
Although Wikipedia describes methods for sampling from a binomial or multinomial distribution, they aren't particularly efficient. There's a method for efficiently generating samples from an arbitrary discrete distribution, which you can use here.
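One readily available way to sample from an arbitrary discrete distribution in C++11 is `std::discrete_distribution`, which takes a list of weights and returns indices with the corresponding probabilities. This is not necessarily the method referred to above, and the standard does not say which algorithm an implementation uses. The helper name `sample_from_weights` and its parameters are made up for this sketch.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// weights[k] is the (unnormalised) probability of the integer first_value + k.
std::vector<int> sample_from_weights(const std::vector<double>& weights,
                                     int first_value,
                                     std::size_t count,
                                     unsigned seed)
{
    std::mt19937 eng(seed);
    // discrete_distribution normalises the weights itself.
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    std::vector<int> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        out.push_back(first_value + dist(eng));   // shift index into the target range
    return out;
}
```

The weights could be binomial probabilities, or any other bell-shaped table you compute yourself.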
In the comments, you clarified that you want a bell-shaped distribution with specific mean and standard deviation and bounded support. So we'll use the Gaussian as a starting point:
- compute a gaussian CDF across the range of integers you're interested in
- offset and scale it slightly to account for the missing tails (so it varies from 0 to 1)
- store it in an array
To sample from this distribution:
- generate uniform reals in the range [0:1]
- use binary search to invert the CDF
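The steps above could be sketched roughly like this, assuming C++11. The class name `BoundedIntGaussian` is invented, and one detail is my own choice rather than part of the recipe: each integer k is given the Gaussian probability mass of the interval [k - 0.5, k + 0.5], so the table runs from `lo - 0.5` to `hi + 0.5`.

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <stdexcept>
#include <vector>

// Gaussian CDF expressed through the complementary error function.
double normal_cdf(double x, double mean, double std_dev)
{
    return 0.5 * std::erfc(-(x - mean) / (std_dev * std::sqrt(2.0)));
}

class BoundedIntGaussian {
public:
    BoundedIntGaussian(int lo, int hi, double mean, double std_dev) : lo_(lo), hi_(hi)
    {
        if (hi < lo || !(std_dev > 0.0))
            throw std::invalid_argument("need lo <= hi and std_dev > 0");
        // Offset and scale so the stored CDF runs from 0 to 1 despite the missing tails.
        const double base = normal_cdf(lo - 0.5, mean, std_dev);
        const double total = normal_cdf(hi + 0.5, mean, std_dev) - base;
        if (!(total > 0.0))
            throw std::invalid_argument("range carries no probability mass");
        cdf_.reserve(hi - lo + 1);
        for (int k = lo; k <= hi; ++k)
            cdf_.push_back((normal_cdf(k + 0.5, mean, std_dev) - base) / total);
        cdf_.back() = 1.0;   // guard against rounding error in the last entry
    }

    template <class Engine>
    int operator()(Engine& eng) const
    {
        const double u = std::uniform_real_distribution<double>(0.0, 1.0)(eng);
        // Binary search for the first table entry strictly greater than u.
        const auto it = std::upper_bound(cdf_.begin(), cdf_.end(), u);
        return it == cdf_.end() ? hi_ : lo_ + static_cast<int>(it - cdf_.begin());
    }

private:
    int lo_, hi_;
    std::vector<double> cdf_;   // cdf_[k - lo_] = P(X <= k)
};
```

Usage would be along the lines of `BoundedIntGaussian dist(0, 100, 50.0, 10.0); std::mt19937 eng(1000); int x = dist(eng);`. Building the table costs one pass over the range, and each sample costs one binary search.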
As the truncation step will reduce the standard deviation slightly (and also shift the mean, if the minimum and maximum aren't equidistant from the chosen mean), you may have to tweak the Gaussian parameters slightly beforehand.
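To see how much tweaking is needed, you can compute the mean and standard deviation that the truncated, discretized distribution actually has for a given pair of Gaussian parameters, and then adjust the parameters until the result matches your targets. A minimal sketch, assuming C++11; the names `Moments` and `truncated_moments` are hypothetical, and as an assumption of the sketch each integer k is given the Gaussian mass of [k - 0.5, k + 0.5].

```cpp
#include <algorithm>
#include <cmath>

struct Moments { double mean; double std_dev; };

// Exact mean and standard deviation of a Gaussian(mean, std_dev) that has been
// discretized to the integers lo..hi and renormalised.
Moments truncated_moments(int lo, int hi, double mean, double std_dev)
{
    auto cdf = [=](double x) {
        return 0.5 * std::erfc(-(x - mean) / (std_dev * std::sqrt(2.0)));
    };
    double total = 0.0, sum = 0.0, sum_sq = 0.0;
    for (int k = lo; k <= hi; ++k) {
        const double mass = cdf(k + 0.5) - cdf(k - 0.5);   // unnormalised P(X = k)
        total  += mass;
        sum    += mass * k;
        sum_sq += mass * k * k;
    }
    const double m = sum / total;
    const double var = std::max(0.0, sum_sq / total - m * m);   // clamp rounding noise
    return { m, std::sqrt(var) };
}
```

With a symmetric range such as 0 to 100 around a mean of 50, the mean is preserved; with a lopsided range such as 40 to 100, the resulting mean moves above 50 and the standard deviation drops below the nominal value, which is the effect described above.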