Fast random selection algorithm
Given an array of true/false values, what is the most efficient algorithm to select an index with a true value at random.
A sketch simple algorithm is
a <- the 开发者_StackOverflow中文版array
c <- 0
for i in a:
if a[i] is true: c++
e <- random number in (0, c-1)
j <- 0
for i in e:
while j is false: j++
return j
Can anyone come up with a faster algorithm? Maybe there is a way to only walk through the list once even if the number of true elements is not known at first?
Use the "pick a random element from an infinite list" algorithm.
Keep an index of your current pick, and also a count of how many true values you've seen.
When you see a true value, increment the count and then replace your pick with the current index with a probability of P=(1/count). (So you always pick the first one you find... then you might switch to the second one, with probability 1/2, then you might switch to the third one with probabilty 1/3 etc.)
This requires only one scan over the list and constant storage. (It does require you to work out a larger number of random numbers, however.) In particular, it doesn't ever require you to either buffer the list or go back to the start - so it can work on an unbounded input stream.
See this answer for a sample LINQ implementation of the simple "pick a random element" algorithm; it would just need minor tweaks.
Build a list with indexes that point to true
values and select one of those at random. Requires O(n) for list traversal and one try for the random number.
精彩评论