Biased Random Number Generator
I am looking for a random number generator that can be biased. For instance, say I want a random number between 1 and 5, with the following probabilities:
1: Comes up 20% of the time
2: Comes up 10% of the time
3: Comes up 40% of the time
4: Comes up 25% of the time
5: Comes up 5% of the time
Is there anything in the standard library, or in other libraries, that would do this? Alternatively, is there an efficient way to do this myself?
For your problem, just pick a random element from this list uniformly:
[1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5]
In general, check this answer: Weighted random numbers
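The list approach above can be sketched in C++ roughly as follows. This is only an illustration, assuming a C++11 `<random>` engine is available; the names `kSlots` and `pickFromSlots` are made up for this example and do not come from any library.

```cpp
#include <array>
#include <cstddef>
#include <random>

// Each value appears in proportion to its probability (20 slots, 5% each).
constexpr std::array<int, 20> kSlots = {1, 1, 1, 1, 2, 2, 3, 3, 3, 3,
                                        3, 3, 3, 3, 4, 4, 4, 4, 4, 5};

// Hypothetical helper: picks one slot uniformly using the supplied engine.
template <class Engine>
int pickFromSlots(Engine& gen) {
    std::uniform_int_distribution<std::size_t> index(0, kSlots.size() - 1);
    return kSlots[index(gen)];
}
```

Calling `pickFromSlots(gen)` with, say, a `std::mt19937 gen` should return 3 about 40% of the time. The table only works this neatly because every probability here is a multiple of 5%; finer-grained weights would need a longer list.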
In TR1 and C++0x, there is the <random> header, which contains the discrete_distribution class to generate such numbers, among others.
You may also want to check out GSL, which contains many more random distributions (and random number generators) than the standard <random> library. (But note that GSL uses GPLv3.)
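As a rough sketch of the discrete_distribution suggestion, assuming a C++11 compiler: `std::discrete_distribution` returns an index starting at 0, so 1 is added to map onto 1 to 5. The function name `biasedOneToFive` is invented for this example.

```cpp
#include <random>

// Hypothetical helper: returns 1..5 with probabilities 20/10/40/25/5 percent.
template <class Engine>
int biasedOneToFive(Engine& gen) {
    // discrete_distribution yields index i with probability weight[i] / sum(weights).
    std::discrete_distribution<int> dist{20.0, 10.0, 40.0, 25.0, 5.0};
    return dist(gen) + 1;
}
```

The weights do not have to sum to 1 or 100; the distribution normalizes them. If the function is called in a tight loop, it is probably worth constructing the distribution once and reusing it.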
Best way's probably to just take the normal unbiased random generator then return based on the interval its value falls into.
Just an if statement that gives 1 for 0:0.2, 2 for 0.2:0.3, 3 for 0.3:0.7, 4 for 0.7:0.95 and 5 for 0.95:1. Best to make either the lower or upper limit of the interval inclusive and the other exclusive.
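// randomNumber() is assumed to return a uniform double between 0 and 1.
int biasedRandom() {
    double i = randomNumber();
    if (i <= 0.2) { return 1; }        // 20%
    else if (i <= 0.3) { return 2; }   // 10%
    else if (i <= 0.7) { return 3; }   // 40%
    else if (i <= 0.95) { return 4; }  // 25%
    else { return 5; }                 // 5%
}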
Something like that.
The Boost random number library provides the ability to specify different shaped distributions for your generator. It's a great library - see http://www.boost.org/doc/libs/1_42_0/libs/random/index.html.
Coming late to the party on this one. Here is the C++0x answer:
#include <iostream>
#include <random>
#include <iterator>

int main()
{
    // Set up distribution
    double interval[] = {1, 2, 3, 4, 5, 6};
    double weights[] = { .2, .1, .4, .25, .05};
    std::piecewise_constant_distribution<> dist(std::begin(interval),
                                                std::end(interval),
                                                std::begin(weights));
    // Choose generator
    std::mt19937 gen;  // seed as wanted
    // Demonstrate by pouring into avg[rand-1]
    const unsigned N = 1000000;
    double avg[sizeof(weights) / sizeof(weights[0])] = {0};
    for (unsigned i = 0; i < N; ++i)
        avg[static_cast<unsigned>(dist(gen)) - 1]++;
    // Compute averages
    for (double* i = std::begin(avg); i < std::end(avg); ++i)
        *i /= N;
    // Display
    for (unsigned i = 1; i <= sizeof(avg)/sizeof(avg[0]); ++i)
        std::cout << "avg[" << i << "] = " << avg[i-1] << '\n';
}
Which for me outputs:
avg[1] = 0.199779
avg[2] = 0.100002
avg[3] = 0.400111
avg[4] = 0.250257
avg[5] = 0.049851
What you are describing is the implementation of a random number generator that draws from a particular probability distribution. For example, drawing numbers from a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$ should produce random numbers such that the probability density of a particular draw, x, is proportional to
$\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.
In general, the approach is to draw from a uniform random distribution and then evaluate the inverse of the desired distribution's cumulative distribution function (CDF) at the drawn value. In the case of a Gaussian, draw a random number u from a uniform distribution on (0, 1) (this is what standard random number generators should give) and then choose
$x = \mu + \sigma\sqrt{2}\,\operatorname{erf}^{-1}(2u - 1)$
as the random, Gaussian-distributed value. For your case, the CDF you describe is a piecewise-constant stair-step function, which could be implemented using any of the many (correct) answers you have already received.
Of course, this is all trivia. What you should be doing is using a library that already handles this for you. Statistics and random number generation are not trivial and there's no need to re-invent the wheel. See Neil's answer (and check out the Boost random number library).
Why don't you just use a regular random number generator that returns a number between 0.0 and 1.0, and wrap it with another function that returns a number according to your requirements?
Something like:
double biased (double seed) {
    // seed is the uniform random value between 0.0 and 1.0
    if (seed >= 0.0 && seed < 0.2) return 1;
    else if ...
}
Draw a random real number x in [0, 1): if 0 <= x < 0.2 return 1, if 0.2 <= x < 0.3 return 2, etc.
See here for the general problem.
Kenny gave an appropriate answer tailored to your particular frequency distribution.
The more general answer works with a CDF - Cumulative Distribution Function - for the data, and uses a uniform random number to pick a value within the distribution.
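A minimal sketch of the CDF idea for a discrete distribution, assuming C++11 and a non-empty list of non-negative weights: build the running totals once, draw a uniform number, and binary-search for the bucket it falls into. The helper names `cumulativeWeights`, `indexForUniform`, and `sampleIndex` are hypothetical, not part of any library.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Builds running totals, e.g. {.2, .1, .4, .25, .05} -> {.2, .3, .7, .95, 1.0}.
std::vector<double> cumulativeWeights(const std::vector<double>& weights) {
    std::vector<double> cdf(weights.size());
    std::partial_sum(weights.begin(), weights.end(), cdf.begin());
    return cdf;
}

// Maps u in [0, total) to the index of the first running total greater than u.
// Assumes cdf is non-empty.
std::size_t indexForUniform(const std::vector<double>& cdf, double u) {
    auto it = std::upper_bound(cdf.begin(), cdf.end(), u);
    if (it == cdf.end()) --it;  // guard against rounding at the top end
    return static_cast<std::size_t>(it - cdf.begin());
}

// Draws a uniform number in [0, total) and looks up its bucket.
template <class Engine>
std::size_t sampleIndex(const std::vector<double>& cdf, Engine& gen) {
    std::uniform_real_distribution<double> uniform(0.0, cdf.back());
    return indexForUniform(cdf, uniform(gen));
}
```

For the distribution in the question, `sampleIndex(cdf, gen) + 1` would give a value from 1 to 5. The lookup is O(log n) per draw, which matters little for five outcomes but helps when there are many.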
I am going to do the same thing and I found this: http://eli.thegreenplace.net/2010/01/22/weighted-random-generation-in-python/
Seems good enough for the purpose you stated.
#include <boost/random/discrete_distribution.hpp>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/variate_generator.hpp>
#include <iostream>

int main()
{
    unsigned int seed = 42;
    boost::mt19937 generator(seed);
    // return 0 with probability 10%
    //        1                  40%
    //        2                  50%
    boost::random::discrete_distribution<int> custom_dist{1,4,5};
    boost::variate_generator<boost::mt19937&,
        boost::random::discrete_distribution<int> > rndn(generator, custom_dist);
    for (unsigned int i = 0; i<10000; i++) {
        std::cout << rndn() << std::endl;
    }
    return 0;
}
And here is a plot of the result:
I was looking for something like this for TypeScript, but only found this question for C.
So here is a biased random number generator in TypeScript I came up with, in case anybody needs something like this in TypeScript. I am sure you can translate it to C somehow.
export async function weightedRandomItem<T>(list: { weight: number; item: T }[]): Promise<T> {
    const weightSum = sumBy(list, (item) => item.weight)
    const randomIndex = await randomIntegerBetween(1, weightSum)
    let currentIndex = 1
    for (const listItem of list) {
        if (randomIndex >= currentIndex && randomIndex < currentIndex + listItem.weight) {
            return listItem.item
        }
        currentIndex += listItem.weight
    }
    throw new Error("No item selected. Impossible.")
}
where randomIntegerBetween(minInclusive: number, maxInclusive: number) returns a random integer from the specified range (min and max inclusive) from the RNG of your choice.
sumBy() is the lodash function in this case, and it should be self-explanatory.
As input you could pass in something like:
[
    { weight: 10, item: 1 },
    { weight: 50, item: 2 },
    { weight: 30, item: 3 },
    { weight: 10, item: 4 },
]
Then the most likely result would be 2, which comes up 50% of the time.