Distinct rand() sequences yielding the same results in an expression
Ok, this is a really weird one.
I have an MPI program, where each process has to generate random numbers in a fixed range (the range is read from file). What happens is that even though I seed each process with a different value, and the numbers generated by rand()
are different in each process, the expression to generate the random numbers still yields the same sequence between them.
Here's all relevant code:
// 'rank' will be unique for each process
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
// seed the RNG with a different value for each process
srand(time(NULL) + rank);
// print some random numbers to see if we get a unique sequence in each process
// 'log' is a uniquely named file, each process has its own
log << rand() << " " << rand() << " " << rand() << std::endl;
// do boring deterministic stuff
while (true)
{
// waitTimeMin and waitTimeMax are integers, Max is always greater than Min
waitSecs = waitTimeMin + rand() % (waitTimeMax - waitTimeMin);
log << "waiting " << waitSecs << " seconds" << std::endl;
sleep(waitSecs);
// do more boring deterministic stuff
}
Here's the output of each process, with 3 processes generating numbers in the range [1,9].
process开发者_如何转开发 1:
15190 28284 3149
waiting 6 seconds
waiting 8 seconds
waiting 9 seconds
waiting 4 seconds
process 2:
286 6264 3153
waiting 6 seconds
waiting 8 seconds
waiting 9 seconds
waiting 4 seconds
process 3:
18151 17013 3156
waiting 6 seconds
waiting 8 seconds
waiting 9 seconds
waiting 4 seconds
So while rand()
clearly generates different numbers, the expression to calculate waitSecs
still evaluates to the same sequence on all processes. What's even weirder: if I run the program with the same parameteres again, only the first 3 random numbers will change, the rest of the "random" sequence will be exactly the same in each run! Changing the range of numbers will obviously produce a different result from this one, but the same parameters always yield the same sequence, between processes and between executions: except for the first 3 numbers.
Just what the hell is going on here?
EDIT: So just to see if it's the simplistic random generation and/or low range, I replaced the random generation with this line:
waitSecs = waitTimeMin + (int)((double)rand() / ((double)RAND_MAX + 1) * (waitTimeMax - waitTimeMin));
And started generating numbers in the range [1,99]. Here's the result:
process 1:
7833 3798 10977
waiting 1 seconds
waiting 20 seconds
waiting 58 seconds
waiting 35 seconds
waiting 82 seconds
waiting 18 seconds
process 2:
25697 14547 10980
waiting 1 seconds
waiting 20 seconds
waiting 58 seconds
waiting 35 seconds
waiting 82 seconds
waiting 18 seconds
process 3:
10794 25295 10984
waiting 1 seconds
waiting 20 seconds
waiting 58 seconds
waiting 35 seconds
waiting 82 seconds
waiting 18 seconds
Same thing. Can this still be just rand()
being really bad?
EDIT2: Same thing when generating numbers from 1 to 10000.
In your code you are only using 3 lower bits if the generated random number (remainder of division by 8). What your experiment shows is that the sequence of the lowest 3 bits of the generated number sequence is the same every time. This is perfectly possible. This is, in fact, a well-known issue with the simplistic pseudo-random number generator usually used to implement rand()
.
If you'd like to use rand()
(instead of a more complicated custom generator), better use higher-order bits, not lower-order ones. I.e. don't use the %
operator to reduce the range of rand()
. Take a look here for a better method: http://c-faq.com/lib/randrange.html
Computing (rand() % n)
is usually a bad idea - you get results that are less than random. Instead, if RAND_MAX
is the range of output of rand()
, try dividing rand()
by (RAND_MAX/(waitTimeMax - waitTimeMin))
.
The rand()
you are using is likely a linear congruential generator. If you follow the latter link, you'll find more information on how it works, along with an explanation of why the lower bits are "less random" than the higher bits.
Ok, apparently I'm retarded. After initializing the RNG, I spawned a new thread and generated the random numbers there, without initialization. Calling srand()
in the new thread fixed the problem. So yeah, the lesson here is that srand()
and rand()
work per thread, not per process. I also need to start posting more information about my program in my questions.
Ouch.
Sorry for wasting everyone's time.
Try using a random number generator from tr1 like std::tr1::mt19937
. The rand()
function is usually implemented with a low quality random number generator.
EDIT: Low quality may mean, for example, that even generating 2D points (x,y)
in [0,100]^2
will lead to points that are not evenly scattered in the square. You may think that it's not supposed to behave that badly, but you will be surprised by how badly plain rand()
actually behaves.(and this is sadly true in most languages).
EDIT2: The method range*(rand()/RAND_MAX)
is not a good one. It has issues with double precision that will not yield even results.
Try the following and see if your program still gives you the results you are surprised of:
std::tr1::mt19937 engine(thread_seed);
std::tr1::uniform_int<> unigen(waitTimeMin, waitTimeMax);
std::tr1::variate_generator<std::tr1::mt19937,
std::tr1::uniform_int<> >gen(engine, unigen);
waitSec = gen();
EDIT3:
(source: dilbert.com)
精彩评论