Using Python for quasi randomization
Here's the problem: I try to randomize n times a choice between two elements (let's say [0,1] -> 0 or开发者_开发百科 1), and my final list will have n/2 [0] + n/2 [1]. I tend to have this kind of result: [0 1 0 0 0 1 0 1 1 1 1 1 1 0 0, until n]: the problem is that I don't want to have serially 4 or 5 times the same number so often. I know that I could use a quasi randomisation procedure, but I don't know how to do so (I'm using Python).
To guarantee that there will be the same number of zeros and ones you can generate a list containing n/2 zeros and n/2 ones and shuffle it with random.shuffle
.
For small n, if you aren't happy that the result passes your acceptance criteria (e.g. not too many consecutive equal numbers), shuffle again. Be aware that doing this reduces the randomness of the result, not increases it.
For larger n it will take too long to find a result that passes your criteria using this method (because most results will fail). Instead you could generate elements one at a time with these rules:
- If you already generated 4 ones in a row the next number must be zero and vice versa.
- Otherwise, if you need to generate x more ones and y more zeros, the chance of the next number being one is x/(x+y).
You can use random.shuffle
to randomize a list.
import random
n = 100
seq = [0]*(n/2) + [1]*(n-n/2)
random.shuffle(seq)
Now you can run through the list and whenever you see a run that's too long, swap an element to break up the sequence. I don't have any code for that part yet.
Having 6 1's in a row isn't particularly improbable -- are you sure you're not getting what you want?
There's a simple Python interface for a uniformly distributed random number, is that what you're looking for?
Here's my take on it. The first two functions are the actual implementation and the last function is for testing it.
The key is the first function which looks at the last N elements of the list where N+1
is the limit of how many times you want a number to appear in a row. It counts the number of ones that occur and then returns 1 with (1 - N/n) probability where n
is the amount of ones already present. Note that this probability is 0 in the case of N
consecutive ones and 1 in the case of N
consecutive zeros.
Like a true random selection, there is no guarantee that the ratio of ones and zeros will be the 1 but averaged out over thousands of runs, it does produce as many ones as zeros.
For longer lists, this will be better than repeatedly calling shuffle
and checking that it satisfies your requirements.
import random
def next_value(selected):
# Mathematically, this isn't necessary but it accounts for
# potential problems with floating point numbers.
if selected.count(0) == 0:
return 0
elif selected.count(1) == 0:
return 1
N = len(selected)
selector = float(selected.count(1)) / N
if random.uniform(0, 1) > selector:
return 1
else:
return 0
def get_sequence(N, max_run):
lim = min(N, max_run - 1)
seq = [random.choice((1, 0)) for _ in xrange(lim)]
for _ in xrange(N - lim):
seq.append(next_value(seq[-max_run+1:]))
return seq
def test(N, max_run, test_count):
ones = 0.0
zeros = 0.0
for _ in xrange(test_count):
seq = get_sequence(N, max_run)
# Keep track of how many ones and zeros we're generating
zeros += seq.count(0)
ones += seq.count(1)
# Make sure that the max_run isn't violated.
counts = [0, 0]
for i in seq:
counts[i] += 1
counts[not i] = 0
if max_run in counts:
print seq
return
# Print the ratio of zeros to ones. This should be around 1.
print zeros/ones
test(200, 5, 10000)
Probably not the smartest way, but it works for "no sequential runs", while not generating the same number of 0s and 1s. See below for version that fits all requirements.
from random import choice
CHOICES = (1, 0)
def quasirandom(n, longest=3):
serial = 0
latest = 0
result = []
rappend = result.append
for i in xrange(n):
val = choice(CHOICES)
if latest == val:
serial += 1
else:
serial = 0
if serial >= longest:
val = CHOICES[val]
rappend(val)
latest = val
return result
print quasirandom(10)
print quasirandom(100)
This one below corrects the filtering shuffle idea and works correctly AFAICT, with the caveat that the very last numbers might form a run. Pass debug=True
to check that the requirements are met.
from random import random
from itertools import groupby # For testing the result
try: xrange
except: xrange = range
def generate_quasirandom(values, n, longest=3, debug=False):
# Sanity check
if len(values) < 2 or longest < 1:
raise ValueError
# Create a list with n * [val]
source = []
sourcelen = len(values) * n
for val in values:
source += [val] * n
# For breaking runs
serial = 0
latest = None
for i in xrange(sourcelen):
# Pick something from source[:i]
j = int(random() * (sourcelen - i)) + i
if source[j] == latest:
serial += 1
if serial >= longest:
serial = 0
guard = 0
# We got a serial run, break it
while source[j] == latest:
j = int(random() * (sourcelen - i)) + i
guard += 1
# We just hit an infinit loop: there is no way to avoid a serial run
if guard > 10:
print("Unable to avoid serial run, disabling asserts.")
debug = False
break
else:
serial = 0
latest = source[j]
# Move the picked value to source[i:]
source[i], source[j] = source[j], source[i]
# More sanity checks
check_quasirandom(source, values, n, longest, debug)
return source
def check_quasirandom(shuffled, values, n, longest, debug):
counts = []
# We skip the last entries because breaking runs in them get too hairy
for val, count in groupby(shuffled):
counts.append(len(list(count)))
highest = max(counts)
print('Longest run: %d\nMax run lenght:%d' % (highest, longest))
# Invariants
assert len(shuffled) == len(values) * n
for val in values:
assert shuffled.count(val) == n
if debug:
# Only checked if we were able to avoid a sequential run >= longest
assert highest <= longest
for x in xrange(10, 1000):
generate_quasirandom((0, 1, 2, 3), 1000, x//10, debug=True)
精彩评论