开发者

Using Python for quasi randomization

Here's the problem: I try to randomize n times a choice between two elements (let's say [0,1] -> 0 or开发者_开发百科 1), and my final list will have n/2 [0] + n/2 [1]. I tend to have this kind of result: [0 1 0 0 0 1 0 1 1 1 1 1 1 0 0, until n]: the problem is that I don't want to have serially 4 or 5 times the same number so often. I know that I could use a quasi randomisation procedure, but I don't know how to do so (I'm using Python).


To guarantee that there will be the same number of zeros and ones you can generate a list containing n/2 zeros and n/2 ones and shuffle it with random.shuffle.

For small n, if you aren't happy that the result passes your acceptance criteria (e.g. not too many consecutive equal numbers), shuffle again. Be aware that doing this reduces the randomness of the result, not increases it.

For larger n it will take too long to find a result that passes your criteria using this method (because most results will fail). Instead you could generate elements one at a time with these rules:

  • If you already generated 4 ones in a row the next number must be zero and vice versa.
  • Otherwise, if you need to generate x more ones and y more zeros, the chance of the next number being one is x/(x+y).


You can use random.shuffle to randomize a list.

import random
n = 100
seq = [0]*(n/2) + [1]*(n-n/2)
random.shuffle(seq)

Now you can run through the list and whenever you see a run that's too long, swap an element to break up the sequence. I don't have any code for that part yet.


Having 6 1's in a row isn't particularly improbable -- are you sure you're not getting what you want?

There's a simple Python interface for a uniformly distributed random number, is that what you're looking for?


Here's my take on it. The first two functions are the actual implementation and the last function is for testing it.

The key is the first function which looks at the last N elements of the list where N+1 is the limit of how many times you want a number to appear in a row. It counts the number of ones that occur and then returns 1 with (1 - N/n) probability where n is the amount of ones already present. Note that this probability is 0 in the case of N consecutive ones and 1 in the case of N consecutive zeros.

Like a true random selection, there is no guarantee that the ratio of ones and zeros will be the 1 but averaged out over thousands of runs, it does produce as many ones as zeros. For longer lists, this will be better than repeatedly calling shuffle and checking that it satisfies your requirements.

import random


def next_value(selected):

    # Mathematically, this isn't necessary but it accounts for
    # potential problems with floating point numbers.
    if selected.count(0) == 0:
        return 0
    elif selected.count(1) == 0:
        return 1

    N = len(selected)
    selector = float(selected.count(1)) / N

    if random.uniform(0, 1) > selector:
        return 1
    else:
        return 0


def get_sequence(N, max_run):
    lim = min(N, max_run - 1)
    seq = [random.choice((1, 0)) for _ in xrange(lim)]

    for _ in xrange(N - lim):
        seq.append(next_value(seq[-max_run+1:]))
    return seq


def test(N, max_run, test_count):
    ones = 0.0
    zeros = 0.0

    for _ in xrange(test_count):
        seq = get_sequence(N, max_run)

        # Keep track of how many ones and zeros we're generating
        zeros += seq.count(0)
        ones += seq.count(1)

        # Make sure that the max_run isn't violated.
        counts = [0, 0]
        for i in seq:
            counts[i] += 1
            counts[not i] = 0
            if max_run in counts:
                print seq
                return


    # Print the ratio of zeros to ones. This should be around 1.
    print zeros/ones


test(200, 5, 10000)


Probably not the smartest way, but it works for "no sequential runs", while not generating the same number of 0s and 1s. See below for version that fits all requirements.

from random import choice
CHOICES = (1, 0)

def quasirandom(n, longest=3):
  serial = 0
  latest = 0
  result = []
  rappend = result.append

  for i in xrange(n):
    val = choice(CHOICES)
    if latest == val:
      serial += 1
    else:
      serial = 0
    if serial >= longest:
      val  = CHOICES[val]
    rappend(val)
    latest = val
  return result

print quasirandom(10)
print quasirandom(100)

This one below corrects the filtering shuffle idea and works correctly AFAICT, with the caveat that the very last numbers might form a run. Pass debug=True to check that the requirements are met.

from random import random
from itertools import groupby # For testing the result
try: xrange
except: xrange = range

def generate_quasirandom(values, n, longest=3, debug=False):
  # Sanity check
  if len(values) < 2 or longest < 1:
    raise ValueError

  # Create a list with n * [val]
  source = []
  sourcelen = len(values) * n
  for val in values:
    source += [val] * n

  # For breaking runs
  serial = 0
  latest = None

  for i in xrange(sourcelen):
    # Pick something from source[:i]
    j = int(random() * (sourcelen - i)) + i
    if source[j] == latest:
      serial += 1
      if serial >= longest:
        serial = 0
        guard = 0
        # We got a serial run, break it
        while source[j] == latest:
          j = int(random() * (sourcelen - i)) + i
          guard += 1
          # We just hit an infinit loop: there is no way to avoid a serial run
          if guard > 10:
            print("Unable to avoid serial run, disabling asserts.")
            debug = False
            break
    else:
      serial = 0
    latest = source[j]
    # Move the picked value to source[i:]
    source[i], source[j] = source[j], source[i]

  # More sanity checks
  check_quasirandom(source, values, n, longest, debug)

  return source


def check_quasirandom(shuffled, values, n, longest, debug):
  counts = []
  # We skip the last entries because breaking runs in them get too hairy
  for val, count in groupby(shuffled):
    counts.append(len(list(count)))
  highest = max(counts)
  print('Longest run: %d\nMax run lenght:%d' % (highest, longest))

  # Invariants
  assert len(shuffled) == len(values) * n
  for val in values:
    assert shuffled.count(val) == n

  if debug:
    # Only checked if we were able to avoid a sequential run >= longest
    assert highest <= longest

for x in xrange(10, 1000):
  generate_quasirandom((0, 1, 2, 3), 1000, x//10, debug=True)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜