
algorithm for python itertools.permutations

Can someone please explain the algorithm behind the itertools.permutations routine in the Python 2.6 standard library? I don't understand why it works.

The code is:

def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return


You need to understand the mathematical theory of permutation cycles, also known as "orbits" (it's important to know both terms of art, since the mathematical subject, at the heart of combinatorics, is quite advanced, and you may need to look up research papers that use either or both terms).

For a simpler introduction to the theory of permutations, Wikipedia can help. Each of the pages I mentioned offers a reasonable bibliography if you get fascinated enough by combinatorics to want to explore it further and gain real understanding (I did, personally -- it's become somewhat of a hobby for me;-).

Once you understand the mathematical theory, the code is still subtle and interesting to "reverse engineer". Clearly, indices is just the current permutation in terms of indices into the pool, given that the items yielded are always given by

yield tuple(pool[i] for i in indices[:r])

So the heart of this fascinating machinery is cycles, which represents the permutation's orbits and causes indices to be updated, mostly by the statements

j = cycles[i]
indices[i], indices[-j] = indices[-j], indices[i]

I.e., if cycles[i] is j, this means that the next update to the indices is to swap the i-th one (from the left) with the j-th one from the right (e.g., if j is 1, then the last element of indices is being swapped -- indices[-1]). And then there's the less frequent "bulk update" when an item of cycles reaches 0 during its decrements:

indices[i:] = indices[i+1:] + indices[i:i+1]
cycles[i] = n - i

This puts the ith item of indices at the very end, shifting all following items of indices one to the left, and indicates that the next time we come to this item of cycles we'll be swapping the new ith item of indices (from the left) with the n - ith one (from the right) -- which would be the ith one again, except of course for the fact that there will be a

cycles[i] -= 1

before we next examine it;-).
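Both index operations can be tried in isolation on a small hypothetical indices list (the values of i and j here are illustrative, not taken from a particular run):

```python
# Minor event: swap the i-th item (from the left) with the j-th from the right.
indices = [0, 1, 2, 3]
i, j = 1, 1          # hypothetical: cycles[i] == 1, so swap with indices[-1]
indices[i], indices[-j] = indices[-j], indices[i]
assert indices == [0, 3, 2, 1]

# Major event: rotate the i-th item to the very end,
# shifting the following items one slot to the left.
indices = [0, 1, 2, 3]
i = 1
indices[i:] = indices[i+1:] + indices[i:i+1]
assert indices == [0, 2, 3, 1]
```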

The hard part would of course be proving that this works -- i.e., that all permutations are exhaustively generated, with no overlap and a correctly "timed" exit. I think that, instead of a proof, it may be easier to look at how the machinery works when fully exposed in simple cases -- commenting out the yield statements and adding print ones (Python 2.*), we have

def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    print 'I', 0, cycles, indices
    # yield tuple(pool[i] for i in indices[:r])
    print indices[:r]
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                print 'B', i, cycles, indices
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
                print 'A', i, cycles, indices
            else:
                print 'b', i, cycles, indices
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                print 'a', i, cycles, indices
                # yield tuple(pool[i] for i in indices[:r])
                print indices[:r]
                break
        else:
            return

permutations('ABC', 2)

Running this shows:

I 0 [3, 2] [0, 1, 2]
[0, 1]
b 1 [3, 1] [0, 1, 2]
a 1 [3, 1] [0, 2, 1]
[0, 2]
B 1 [3, 0] [0, 2, 1]
A 1 [3, 2] [0, 1, 2]
b 0 [2, 2] [0, 1, 2]
a 0 [2, 2] [1, 0, 2]
[1, 0]
b 1 [2, 1] [1, 0, 2]
a 1 [2, 1] [1, 2, 0]
[1, 2]
B 1 [2, 0] [1, 2, 0]
A 1 [2, 2] [1, 0, 2]
b 0 [1, 2] [1, 0, 2]
a 0 [1, 2] [2, 0, 1]
[2, 0]
b 1 [1, 1] [2, 0, 1]
a 1 [1, 1] [2, 1, 0]
[2, 1]
B 1 [1, 0] [2, 1, 0]
A 1 [1, 2] [2, 0, 1]
B 0 [0, 2] [2, 0, 1]
A 0 [3, 2] [0, 1, 2]

Focus on the cycles: they start as 3, 2 -- then the last one is decremented, so 3, 1 -- the last isn't zero yet so we have a "small" event (one swap in the indices) and break the inner loop. Then we enter it again, this time the decrement of the last gives 3, 0 -- the last is now zero so it's a "big" event -- "mass swap" in the indices (well there's not much of a mass here, but, there might be;-) and the cycles are back to 3, 2. But now we haven't broken off the for loop, so we continue by decrementing the next-to-last (in this case, the first) -- which gives a minor event, one swap in the indices, and we break the inner loop again. Back to the loop, yet again the last one is decremented, this time giving 2, 1 -- minor event, etc. Eventually a whole for loop occurs with only major events, no minor ones -- that's when the cycles start as all ones, so the decrement takes each to zero (major event), no yield occurs on that last cycle.

Since no break ever executed in that cycle, we take the else branch of the for, which returns. Note that the while n may be a bit misleading: it actually acts as a while True -- n never changes, the while loop only exits from that return statement; it could equally well be expressed as if not n: return followed by while True:, because of course when n is 0 (empty "pool") there's nothing more to yield after the first, trivial empty yield. The author just decided to save a couple of lines by collapsing the if not n: check with the while;-).
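The empty-pool corner case is easy to check against the real itertools (behavior verified with Python 3's itertools.permutations):

```python
from itertools import permutations

# An empty pool still yields exactly one permutation: the empty tuple.
assert list(permutations('')) == [()]

# With r > n the generator returns immediately and yields nothing.
assert list(permutations('', 1)) == []
assert list(permutations('AB', 3)) == []
```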

I suggest you continue by examining a few more concrete cases -- eventually you should perceive the "clockwork" operating. Focus on just cycles at first (maybe edit the print statements accordingly, removing indices from them), since their clockwork-like progress through their orbit is the key to this subtle and deep algorithm; once you grok that, the way indices get properly updated in response to the sequencing of cycles is almost an anticlimax!-)
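A direct way to convince yourself that the pure-Python equivalent is faithful is to port it to Python 3 (wrapping range in list so the slice assignments work) and compare its output against the C implementation:

```python
from itertools import permutations as c_permutations

def permutations(iterable, r=None):
    # Python 3 port of the pure-Python equivalent from the itertools docs.
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = list(range(n))
    cycles = list(range(n, n - r, -1))
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return

assert list(permutations('ABCD', 2)) == list(c_permutations('ABCD', 2))
assert list(permutations(range(3))) == list(c_permutations(range(3)))
```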


It is easier to answer with a pattern in the results than with words (unless you want the mathematical part of the theory), so printouts are the best way to explain.
The most subtle thing is that, after looping to the end, the algorithm resets itself to the first turn of the last round and starts the next loop down, or keeps resetting to the first turn of ever bigger rounds, like a clock.

The part of the code doing the reset job:

if cycles[i] == 0:
    indices[i:] = indices[i+1:] + indices[i:i+1]
    cycles[i] = n - i

The whole code:

In [54]: def permutations(iterable, r=None):
    ...:     # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    ...:     # permutations(range(3)) --> 012 021 102 120 201 210
    ...:     pool = tuple(iterable)
    ...:     n = len(pool)
    ...:     r = n if r is None else r
    ...:     if r > n:
    ...:         return
    ...:     indices = range(n)
    ...:     cycles = range(n, n-r, -1)
    ...:     yield tuple(pool[i] for i in indices[:r])
    ...:     print(indices, cycles)
    ...:     while n:
    ...:         for i in reversed(range(r)):
    ...:             cycles[i] -= 1
    ...:             if cycles[i] == 0:
    ...:                 indices[i:] = indices[i+1:] + indices[i:i+1]
    ...:                 cycles[i] = n - i
    ...:                 print("reset------------------")
    ...:                 print(indices, cycles)
    ...:                 print("------------------")
    ...:             else:
    ...:                 j = cycles[i]
    ...:                 indices[i], indices[-j] = indices[-j], indices[i]
    ...:                 print(indices, cycles, i, n-j)
    ...:                 yield tuple(pool[i] for i in indices[:r])
    ...:                 break
    ...:         else:
    ...:             return

part of the result:

In [54]: list(','.join(i) for i in permutations('ABCDE', 3))
([0, 1, 2, 3, 4], [5, 4, 3])
([0, 1, 3, 2, 4], [5, 4, 2], 2, 3)
([0, 1, 4, 2, 3], [5, 4, 1], 2, 4)
reset------------------
([0, 1, 2, 3, 4], [5, 4, 3])
------------------
([0, 2, 1, 3, 4], [5, 3, 3], 1, 2)
([0, 2, 3, 1, 4], [5, 3, 2], 2, 3)
([0, 2, 4, 1, 3], [5, 3, 1], 2, 4)
reset------------------
([0, 2, 1, 3, 4], [5, 3, 3])
------------------
([0, 3, 1, 2, 4], [5, 2, 3], 1, 3)
([0, 3, 2, 1, 4], [5, 2, 2], 2, 3)
([0, 3, 4, 1, 2], [5, 2, 1], 2, 4)
reset------------------
([0, 3, 1, 2, 4], [5, 2, 3])
------------------
([0, 4, 1, 2, 3], [5, 1, 3], 1, 4)
([0, 4, 2, 1, 3], [5, 1, 2], 2, 3)
([0, 4, 3, 1, 2], [5, 1, 1], 2, 4)
reset------------------
([0, 4, 1, 2, 3], [5, 1, 3])
------------------
reset------------------(bigger reset)
([0, 1, 2, 3, 4], [5, 4, 3])
------------------
([1, 0, 2, 3, 4], [4, 4, 3], 0, 1)
([1, 0, 3, 2, 4], [4, 4, 2], 2, 3)
([1, 0, 4, 2, 3], [4, 4, 1], 2, 4)
reset------------------
([1, 0, 2, 3, 4], [4, 4, 3])
------------------
([1, 2, 0, 3, 4], [4, 3, 3], 1, 2)
([1, 2, 3, 0, 4], [4, 3, 2], 2, 3)
([1, 2, 4, 0, 3], [4, 3, 1], 2, 4)


I recently stumbled upon the very same question during my journey of reimplementing permutation algorithms, and would like to share my understanding of this interesting algorithm.

TL;DR: This algorithm is based on a recursive permutation-generation algorithm (backtracking based, utilizing element swaps), transformed (or optimized) into an iterative form (possibly to improve efficiency and prevent stack overflow).

Basics

Before we start, I have to make sure we use the same notation as the original algorithm.

  • n refers to the length of iterable
  • r refers to the length of one output permutation tuple

And share a simple observation (as discussed by Alex):

  • Whenever the algorithm yields an output, it just takes the first r elements of the indices list.

cycles

First, let’s discuss the variable cycles and build some intuition. With some debugging prints, we can see that cycles acts like a countdown (of a clock, something like 01:00:00 -> 00:59:59 -> 00:59:58):

  • Every item is initialized to range(n, n-r, -1), resulting in cycles[0]=n, cycles[1]=n-1...cycles[i]=n-i
  • Usually, only the last element is decreased, and each decrement (provided cycles[r-1] != 0 after the decrement) yields an output (a permutation tuple). We can intuitively name this case tick.
  • Whenever an element (say cycles[i]) decreases to 0, it triggers a decrease on the element before it (cycles[i-1]). Then the triggering element (cycles[i]) is restored to its initial value (n-i). This behavior is similar to a borrow in subtraction, or to the minutes resetting when the seconds reach 0 in a clock countdown. We can intuitively name this branch reset.

To further confirm our intuition, add some print statements to the algorithm and run it with the parameters iterable="ABCD", r=2. We can see the following changes of the cycles variable. Note that square brackets indicate a “tick” happening, yielding an output, and the curly braces indicate a “reset” happening, which doesn’t yield output.

[4,3] -> [4,2] -> [4,1] -> {4,0} -> {4,3} -> 
[3,3] -> [3,2] -> [3,1] -> {3,0} -> {3,3} -> 
[2,3] -> [2,2] -> [2,1] -> {2,0} -> {2,3} -> 
[1,3] -> [1,2] -> [1,1] -> {1,0} -> {1,3} -> {0,3} -> {4,3}
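The transition table above can be reproduced with a small instrumented Python 3 port that records the state of cycles after every change instead of yielding (the function name and recording scheme here are mine, for illustration):

```python
def cycle_states(iterable, r):
    # Python 3 port of the algorithm, recording cycles instead of yielding.
    pool = tuple(iterable)
    n = len(pool)
    indices = list(range(n))
    cycles = list(range(n, n - r, -1))
    states = [cycles[:]]
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                states.append(cycles[:])     # "reset": hit zero...
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
                states.append(cycles[:])     # ...and restored to n - i
            else:
                states.append(cycles[:])     # "tick": a yield would happen here
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                break
        else:
            return states

states = cycle_states('ABCD', 2)
assert states[:6] == [[4, 3], [4, 2], [4, 1], [4, 0], [4, 3], [3, 3]]
assert states[-4:] == [[1, 0], [1, 3], [0, 3], [4, 3]]
```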

Using the initial values and the change pattern of cycles, we can come to a possible interpretation of the meaning of cycles: the number of remaining permutations (outputs) at each index. When initialized, cycles[0]=n represents that there are initially n possible choices at index 0, and cycles[1]=n-1 represents that there are initially n-1 possible choices at index 1, all the way down to cycles[r-1]=n-r+1. This interpretation of cycles matches the math, as some simple combinatorial calculation confirms that this is indeed the case. Another supporting piece of evidence is that whenever the algorithm ends, we have had P(n,r) ( P(n,r)=n*(n-1)*...*(n-r+1) ) ticks (counting the initial yield before entering the while loop as a tick).
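The P(n,r) tick count is straightforward to verify with itertools itself (using math.perm, available since Python 3.8):

```python
from itertools import permutations
from math import perm  # perm(n, r) == n! / (n - r)!

for n in range(6):
    for r in range(n + 1):
        # count every yield, including the initial one before the while loop
        ticks = sum(1 for _ in permutations(range(n), r))
        assert ticks == perm(n, r)
```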

indices

Now we come to the more complex part, the indices list. As this is essentially a recursive algorithm (more precisely, backtracking), I would like to start from a sub-problem (i=r-1): when the values from index 0 to index r-2 (inclusive) in indices are fixed, and only the value at index r-1 (in other words, the last element of indices) is changing. I will also introduce a concrete example (iterable="ABCDE", r=3), and we will focus on how it generates the first 3 outputs: ABC, ABD, ABE.

  • Following the sub-problem, we split the list of indices into 3 parts, and give them names,
    • fixed : indices[0:r-2] (inclusive)
    • changing: indices[r-1] (only one value)
    • backlog: indices[r:n-1] (the remaining parts beside the first two)
  • As this is a backtracking algorithm, we need to keep an invariant unmodified before and after the execution. The invariant here is:
    • the sublist containing changing and backlog (indices[r-1:n-1]), which is modified during the execution but restored when it ends.
  • Now we can turn to the interaction between cycles and indices during the mysterious while loop. Some of the operations have been outlined by Alex, and I further elaborate.
    • In each tick, the element in the changing part is swapped with some element in the backlog part, and the relative order in the backlog part is maintained.
      • Using the characters to visualize the indices, and curly braces highlights the backlog part:
      • ABC{DE} -> ABD{CE} -> ABE{CD}
    • When reset happens, the element in the changing part is moved to the back of backlog, thus restoring the initial layout of the sublist (containing the changing part and the backlog part)
      • Using the characters to visualize the indices, and curly braces highlights the changing part:
      • AB{E}CD -> ABCD{E}
  • During this execution (of i=r-1), only the tick phase can yield outputs, and it yields n-r+1 outputs in total, matching the initial value of cycles[i]. This also follows mathematically: there are only n-r+1 permutation choices when the fixed part is fixed.
  • After cycles[i] is decreased to 0, the reset phase kicks in, resetting cycles[i] to n-r+1 and restoring the invariant sublist. This phase marks the end of this execution, and indicates that all possible permutation choices given the fixed prefix part have been output.
  • Therefore, we have shown that, in this sub-problem (i=r-1), this algorithm is indeed a valid backtracking algorithm, as it
    • Outputs all possible values given the precondition (fixed prefix part)
    • Keeps the invariant unmodified (restored in reset phase)
  • This proof(?) can also be generalized to other values of i, thus proving(?) the correctness of this permutation-generation algorithm.
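The tick/reset bookkeeping on the sublist can be acted out directly on a character list (fixed = "AB", changing = "C", backlog = "DE" in this illustration, i.e. r = 3 and the changing slot is index 2):

```python
lst = list("ABCDE")
changing = 2  # index r - 1

# tick: swap the changing element with successive backlog elements;
# the relative order of the backlog is maintained
lst[changing], lst[3] = lst[3], lst[changing]
assert "".join(lst) == "ABDCE"   # ABD{CE}
lst[changing], lst[4] = lst[4], lst[changing]
assert "".join(lst) == "ABECD"   # ABE{CD}

# reset: move the changing element behind the backlog,
# restoring the initial layout of the sublist
lst[changing:] = lst[changing+1:] + [lst[changing]]
assert "".join(lst) == "ABCDE"
```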

Reimplementation

Phew! That’s a long read, and you may want to tinker some more (more prints) with the algorithm to be fully convinced. In essence, we can simplify the underlying principle of the algorithm as the following pseudo-code:

// precondition: the fixed part (or prefix) is fixed
OUTPUT initial_permutation // also invokes the next level
WHILE remaining_permutation_count > 0
    // tick
    swap the changing element with an element in backlog
    OUTPUT current_permutation // also invokes the next level
// reset
move the changing element behind the backlog

And here is a Python implementation using simple backtracking:

# helpers
def swap(list, i, j):
    list[i], list[j] = list[j], list[i]

def move_to_last(list, i):
    list[i:] = list[i+1:] + [list[i]]

def print_first_n_element(list, n):
    print("".join(list[:n]))

# backtracking dfs
def permutations(list, r, changing_index):
    if changing_index == r:
        # we've reached the deepest level
        print_first_n_element(list, r)
        return
    
    # a pseudo `tick`
    # process initial permutation
    # which is just doing nothing (using the initial value)
    permutations(list, r, changing_index + 1)

    # note: the initial permutation has been output, thus the minus 1
    remaining_choices = len(list) - 1 - changing_index
    # for (i=1;i<=remaining_choices;i++)
    for i in range(1, remaining_choices+1):
        # `tick` phases
        
        # make one swap
        swap_idx = changing_index + i
        swap(list, changing_index, swap_idx)
        # finished one move at current level, now go deeper
        permutations(list, r, changing_index + 1)
    
    # `reset` phase
    move_to_last(list, changing_index)

# wrapper
def permutations_wrapper(list, r):
    permutations(list, r, 0)

# main
if __name__ == "__main__":
    my_list = ["A", "B", "C", "D"]
    permutations_wrapper(my_list, 2)

Now all that remains is to show that the backtracking version is equivalent to the iterative version in the itertools source code. It should be pretty easy once you grasp why this algorithm works. Following the great tradition of various CS textbooks, this is left as an exercise to the reader.
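As a sanity check for that exercise, here is a collecting variant of the backtracking version (the wrapper name is mine), compared against itertools.permutations: both produce exactly the same sequence.

```python
from itertools import permutations as itertools_permutations

def backtrack_permutations(pool, r):
    # Same backtracking walk as above, collecting tuples instead of printing.
    lst = list(pool)
    out = []

    def walk(changing_index):
        if changing_index == r:
            out.append(tuple(lst[:r]))   # deepest level: record first r elements
            return
        walk(changing_index + 1)         # initial permutation at this level
        remaining_choices = len(lst) - 1 - changing_index
        for i in range(1, remaining_choices + 1):
            # tick: swap, then go deeper
            swap_idx = changing_index + i
            lst[changing_index], lst[swap_idx] = lst[swap_idx], lst[changing_index]
            walk(changing_index + 1)
        # reset: rotate the changing element to the end
        lst[changing_index:] = lst[changing_index+1:] + [lst[changing_index]]

    walk(0)
    return out

assert backtrack_permutations("ABCD", 2) == list(itertools_permutations("ABCD", 2))
assert backtrack_permutations(range(4), 3) == list(itertools_permutations(range(4), 3))
```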
