How to split a list into pairs in all possible ways
I have a list (say 6 elements for simplicity)
L = [0, 1, 2, 3, 4, 5]
and I want to chunk it into p开发者_如何学Pythonairs in ALL possible ways. I show some configurations:
[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]
and so on.
Here (a, b) = (b, a)
and the order of pairs is not important i.e.
[(0, 1), (2, 3), (4, 5)] = [(0, 1), (4, 5), (2, 3)]
The total number of such configurations is 1*3*5*...*(N-1)
where N
is the length of my list.
How can I write a generator in Python that gives me all possible configurations for an arbitrary N
?
Take a look at itertools.combinations
.
matt@stanley:~$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import itertools
>>> list(itertools.combinations(range(6), 2))
[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]
I don't think there's any function in the standard library that does exactly what you need. Just using itertools.combinations
can get you a list of all possible individual pairs, but doesn't actually solve the problem of all valid pair combinations.
You could solve this easily with:
import itertools
def all_pairs(lst):
for p in itertools.permutations(lst):
i = iter(p)
yield zip(i,i)
But this will get you duplicates as it treats (a,b) and (b,a) as different, and also gives all orderings of pairs. In the end, I figured it's easier to code this from scratch than trying to filter the results, so here's the correct function.
def all_pairs(lst):
if len(lst) < 2:
yield []
return
if len(lst) % 2 == 1:
# Handle odd length list
for i in range(len(lst)):
for result in all_pairs(lst[:i] + lst[i+1:]):
yield result
else:
a = lst[0]
for i in range(1,len(lst)):
pair = (a,lst[i])
for rest in all_pairs(lst[1:i]+lst[i+1:]):
yield [pair] + rest
It's recursive, so it will run into stack issues with a long list, but otherwise does what you need.
>>> for x in all_pairs([0,1,2,3,4,5]): print x [(0, 1), (2, 3), (4, 5)] [(0, 1), (2, 4), (3, 5)] [(0, 1), (2, 5), (3, 4)] [(0, 2), (1, 3), (4, 5)] [(0, 2), (1, 4), (3, 5)] [(0, 2), (1, 5), (3, 4)] [(0, 3), (1, 2), (4, 5)] [(0, 3), (1, 4), (2, 5)] [(0, 3), (1, 5), (2, 4)] [(0, 4), (1, 2), (3, 5)] [(0, 4), (1, 3), (2, 5)] [(0, 4), (1, 5), (2, 3)] [(0, 5), (1, 2), (3, 4)] [(0, 5), (1, 3), (2, 4)] [(0, 5), (1, 4), (2, 3)]
How about this:
items = ["me", "you", "him"]
[(items[i],items[j]) for i in range(len(items)) for j in range(i+1, len(items))]
[('me', 'you'), ('me', 'him'), ('you', 'him')]
or
items = [1, 2, 3, 5, 6]
[(items[i],items[j]) for i in range(len(items)) for j in range(i+1, len(items))]
[(1, 2), (1, 3), (1, 5), (1, 6), (2, 3), (2, 5), (2, 6), (3, 5), (3, 6), (5, 6)]
Conceptually similar to @shang's answer, but it does not assume that groups are of size 2:
import itertools
def generate_groups(lst, n):
if not lst:
yield []
else:
for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
for groups in generate_groups([x for x in lst if x not in group], n):
yield [group] + groups
pprint(list(generate_groups([0, 1, 2, 3, 4, 5], 2)))
This yields:
[[(0, 1), (2, 3), (4, 5)],
[(0, 1), (2, 4), (3, 5)],
[(0, 1), (2, 5), (3, 4)],
[(0, 2), (1, 3), (4, 5)],
[(0, 2), (1, 4), (3, 5)],
[(0, 2), (1, 5), (3, 4)],
[(0, 3), (1, 2), (4, 5)],
[(0, 3), (1, 4), (2, 5)],
[(0, 3), (1, 5), (2, 4)],
[(0, 4), (1, 2), (3, 5)],
[(0, 4), (1, 3), (2, 5)],
[(0, 4), (1, 5), (2, 3)],
[(0, 5), (1, 2), (3, 4)],
[(0, 5), (1, 3), (2, 4)],
[(0, 5), (1, 4), (2, 3)]]
My boss is probably not going to be happy I spent a little time on this fun problem, but here's a nice solution that doesn't need recursion, and uses itertools.product
. It's explained in the docstring :). The results seem OK, but I haven't tested it too much.
import itertools
def all_pairs(lst):
"""Generate all sets of unique pairs from a list `lst`.
This is equivalent to all _partitions_ of `lst` (considered as an indexed
set) which have 2 elements in each partition.
Recall how we compute the total number of such partitions. Starting with
a list
[1, 2, 3, 4, 5, 6]
one takes off the first element, and chooses its pair [from any of the
remaining 5]. For example, we might choose our first pair to be (1, 4).
Then, we take off the next element, 2, and choose which element it is
paired to (say, 3). So, there are 5 * 3 * 1 = 15 such partitions.
That sounds like a lot of nested loops (i.e. recursion), because 1 could
pick 2, in which case our next element is 3. But, if one abstracts "what
the next element is", and instead just thinks of what index it is in the
remaining list, our choices are static and can be aided by the
itertools.product() function.
"""
N = len(lst)
choice_indices = itertools.product(*[
xrange(k) for k in reversed(xrange(1, N, 2)) ])
for choice in choice_indices:
# calculate the list corresponding to the choices
tmp = lst[:]
result = []
for index in choice:
result.append( (tmp.pop(0), tmp.pop(index)) )
yield result
cheers!
A non-recursive function to find all the possible pairs where the order does not matter, i.e., (a,b) = (b,a)
def combinantorial(lst):
count = 0
index = 1
pairs = []
for element1 in lst:
for element2 in lst[index:]:
pairs.append((element1, element2))
index += 1
return pairs
Since it is non-recursive you won't experience memory issues with long lists.
Example of usage:
my_list = [1, 2, 3, 4, 5]
print(combinantorial(my_list))
>>>
[(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]
Try the following recursive generator function:
def pairs_gen(L):
if len(L) == 2:
yield [(L[0], L[1])]
else:
first = L.pop(0)
for i, e in enumerate(L):
second = L.pop(i)
for list_of_pairs in pairs_gen(L):
list_of_pairs.insert(0, (first, second))
yield list_of_pairs
L.insert(i, second)
L.insert(0, first)
Example usage:
>>> for pairs in pairs_gen([0, 1, 2, 3, 4, 5]):
... print pairs
...
[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]
[(0, 2), (1, 3), (4, 5)]
[(0, 2), (1, 4), (3, 5)]
[(0, 2), (1, 5), (3, 4)]
[(0, 3), (1, 2), (4, 5)]
[(0, 3), (1, 4), (2, 5)]
[(0, 3), (1, 5), (2, 4)]
[(0, 4), (1, 2), (3, 5)]
[(0, 4), (1, 3), (2, 5)]
[(0, 4), (1, 5), (2, 3)]
[(0, 5), (1, 2), (3, 4)]
[(0, 5), (1, 3), (2, 4)]
[(0, 5), (1, 4), (2, 3)]
I made a small test suite for all the compliant solutions. I had to change the functions a bit to get them to work in Python 3. Interestingly, the fastest function in PyPy is the slowest function in Python 2/3 in some cases.
import itertools
import time
from collections import OrderedDict
def tokland_org(lst, n):
if not lst:
yield []
else:
for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
for groups in tokland_org([x for x in lst if x not in group], n):
yield [group] + groups
tokland = lambda x: tokland_org(x, 2)
def gatoatigrado(lst):
N = len(lst)
choice_indices = itertools.product(*[
range(k) for k in reversed(range(1, N, 2)) ])
for choice in choice_indices:
# calculate the list corresponding to the choices
tmp = list(lst)
result = []
for index in choice:
result.append( (tmp.pop(0), tmp.pop(index)) )
yield result
def shang(X):
lst = list(X)
if len(lst) < 2:
yield lst
return
a = lst[0]
for i in range(1,len(lst)):
pair = (a,lst[i])
for rest in shang(lst[1:i]+lst[i+1:]):
yield [pair] + rest
def smichr(X):
lst = list(X)
if not lst:
yield [tuple()]
elif len(lst) == 1:
yield [tuple(lst)]
elif len(lst) == 2:
yield [tuple(lst)]
else:
if len(lst) % 2:
for i in (None, True):
if i not in lst:
lst = lst + [i]
PAD = i
break
else:
while chr(i) in lst:
i += 1
PAD = chr(i)
lst = lst + [PAD]
else:
PAD = False
a = lst[0]
for i in range(1, len(lst)):
pair = (a, lst[i])
for rest in smichr(lst[1:i] + lst[i+1:]):
rv = [pair] + rest
if PAD is not False:
for i, t in enumerate(rv):
if PAD in t:
rv[i] = (t[0],)
break
yield rv
def adeel_zafar(X):
L = list(X)
if len(L) == 2:
yield [(L[0], L[1])]
else:
first = L.pop(0)
for i, e in enumerate(L):
second = L.pop(i)
for list_of_pairs in adeel_zafar(L):
list_of_pairs.insert(0, (first, second))
yield list_of_pairs
L.insert(i, second)
L.insert(0, first)
if __name__ =="__main__":
import timeit
import pprint
candidates = dict(tokland=tokland, gatoatigrado=gatoatigrado, shang=shang, smichr=smichr, adeel_zafar=adeel_zafar)
for i in range(1,7):
results = [ frozenset([frozenset(x) for x in candidate(range(i*2))]) for candidate in candidates.values() ]
assert len(frozenset(results)) == 1
print("Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty")
times = dict([(k, timeit.timeit('list({0}(range(6)))'.format(k), setup="from __main__ import {0}".format(k), number=10000)) for k in candidates.keys()])
pprint.pprint([(k, "{0:.3g}".format(v)) for k,v in OrderedDict(sorted(times.items(), key=lambda t: t[1])).items()])
print("Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty")
times = dict([(k, timeit.timeit('list(islice({0}(range(52)), 800))'.format(k), setup="from itertools import islice; from __main__ import {0}".format(k), number=100)) for k in candidates.keys()])
pprint.pprint([(k, "{0:.3g}".format(v)) for k,v in OrderedDict(sorted(times.items(), key=lambda t: t[1])).items()])
"""
print("The 10000th permutations of the previous series:")
gens = dict([(k,v(range(52))) for k,v in candidates.items()])
tenthousands = dict([(k, list(itertools.islice(permutations, 10000))[-1]) for k,permutations in gens.items()])
for pair in tenthousands.items():
print(pair[0])
print(pair[1])
"""
They all seem to generate the exact same order, so the sets aren't necessary, but this way it's future proof. I experimented a bit with the Python 3 conversion, it is not always clear where to construct the list, but I tried some alternatives and chose the fastest.
Here are the benchmark results:
% echo "pypy"; pypy all_pairs.py; echo "python2"; python all_pairs.py; echo "python3"; python3 all_pairs.py
pypy
Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty
[('gatoatigrado', '0.0626'),
('adeel_zafar', '0.125'),
('smichr', '0.149'),
('shang', '0.2'),
('tokland', '0.27')]
Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty
[('gatoatigrado', '0.29'),
('adeel_zafar', '0.411'),
('smichr', '0.464'),
('shang', '0.493'),
('tokland', '0.553')]
python2
Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty
[('gatoatigrado', '0.344'),
('adeel_zafar', '0.374'),
('smichr', '0.396'),
('shang', '0.495'),
('tokland', '0.675')]
Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty
[('adeel_zafar', '0.773'),
('shang', '0.823'),
('smichr', '0.841'),
('tokland', '0.948'),
('gatoatigrado', '1.38')]
python3
Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty
[('gatoatigrado', '0.385'),
('adeel_zafar', '0.419'),
('smichr', '0.433'),
('shang', '0.562'),
('tokland', '0.837')]
Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty
[('smichr', '0.783'),
('shang', '0.81'),
('adeel_zafar', '0.835'),
('tokland', '0.969'),
('gatoatigrado', '1.3')]
% pypy --version
Python 2.7.12 (5.6.0+dfsg-0~ppa2~ubuntu16.04, Nov 11 2016, 16:31:26)
[PyPy 5.6.0 with GCC 5.4.0 20160609]
% python3 --version
Python 3.5.2
So I say, go with gatoatigrado's solution.
def f(l):
if l == []:
yield []
return
ll = l[1:]
for j in range(len(ll)):
for end in f(ll[:j] + ll[j+1:]):
yield [(l[0], ll[j])] + end
Usage:
for x in f([0,1,2,3,4,5]):
print x
>>>
[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]
[(0, 2), (1, 3), (4, 5)]
[(0, 2), (1, 4), (3, 5)]
[(0, 2), (1, 5), (3, 4)]
[(0, 3), (1, 2), (4, 5)]
[(0, 3), (1, 4), (2, 5)]
[(0, 3), (1, 5), (2, 4)]
[(0, 4), (1, 2), (3, 5)]
[(0, 4), (1, 3), (2, 5)]
[(0, 4), (1, 5), (2, 3)]
[(0, 5), (1, 2), (3, 4)]
[(0, 5), (1, 3), (2, 4)]
[(0, 5), (1, 4), (2, 3)]
L = [1, 1, 2, 3, 4]
answer = []
for i in range(len(L)):
for j in range(i+1, len(L)):
if (L[i],L[j]) not in answer:
answer.append((L[i],L[j]))
print answer
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
Hope this helps
Hope this will help:
L = [0, 1, 2, 3, 4, 5]
[(i,j) for i in L for j in L]
output:
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]
This code works when the length of the list is not a multiple of 2; it employs a hack to make it work. Perhaps there are better ways to do this...It also ensures that the pairs are always in a tuple and that it works whether the input is a list or tuple.
def all_pairs(lst):
"""Return all combinations of pairs of items of ``lst`` where order
within the pair and order of pairs does not matter.
Examples
========
>>> for i in range(6):
... list(all_pairs(range(i)))
...
[[()]]
[[(0,)]]
[[(0, 1)]]
[[(0, 1), (2,)], [(0, 2), (1,)], [(0,), (1, 2)]]
[[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(0, 3), (1, 2)]]
[[(0, 1), (2, 3), (4,)], [(0, 1), (2, 4), (3,)], [(0, 1), (2,), (3, 4)], [(0, 2)
, (1, 3), (4,)], [(0, 2), (1, 4), (3,)], [(0, 2), (1,), (3, 4)], [(0, 3), (1, 2)
, (4,)], [(0, 3), (1, 4), (2,)], [(0, 3), (1,), (2, 4)], [(0, 4), (1, 2), (3,)],
[(0, 4), (1, 3), (2,)], [(0, 4), (1,), (2, 3)], [(0,), (1, 2), (3, 4)], [(0,),
(1, 3), (2, 4)], [(0,), (1, 4), (2, 3)]]
Note that when the list has an odd number of items, one of the
pairs will be a singleton.
References
==========
http://stackoverflow.com/questions/5360220/
how-to-split-a-list-into-pairs-in-all-possible-ways
"""
if not lst:
yield [tuple()]
elif len(lst) == 1:
yield [tuple(lst)]
elif len(lst) == 2:
yield [tuple(lst)]
else:
if len(lst) % 2:
for i in (None, True):
if i not in lst:
lst = list(lst) + [i]
PAD = i
break
else:
while chr(i) in lst:
i += 1
PAD = chr(i)
lst = list(lst) + [PAD]
else:
PAD = False
a = lst[0]
for i in range(1, len(lst)):
pair = (a, lst[i])
for rest in all_pairs(lst[1:i] + lst[i+1:]):
rv = [pair] + rest
if PAD is not False:
for i, t in enumerate(rv):
if PAD in t:
rv[i] = (t[0],)
break
yield rv
I'm adding in my own contribution, which builds on the great solutions provided by @shang and @tokland. My problem was that in a group of 12, I wanted to also see all the possible pairs when your pair size does not divide perfectly with the group size. For instance, for an input list size of 12, I want to see all possible pairs with 5 elements.
This snip of code and small modification should address that issue:
import itertools
def generate_groups(lst, n):
if not lst:
yield []
else:
if len(lst) % n == 0:
for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
for groups in generate_groups([x for x in lst if x not in group], n):
yield [group] + groups
else:
for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
group2 = [x for x in lst if x not in group]
for grp in (((group2[0],) + xs2) for xs2 in itertools.combinations(group2[1:], n-1)):
yield [group] + [grp]
Thus, in this case, if I run the following snip of code, I get the results below. The final snip of code is a sanity check that I have no overlapping elements.
i = 0
for x in generate_groups([1,2,3,4,5,6,7,8,9,10,11,12], 5):
print(x)
for elem in x[0]:
if elem in x[1]:
print('break')
break
>>>
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 11)]
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 12)]
[(1, 2, 3, 4, 5), (6, 7, 8, 10, 11)]
[(1, 2, 3, 4, 5), (6, 7, 8, 10, 12)]
[(1, 2, 3, 4, 5), (6, 7, 8, 11, 12)]
[(1, 2, 3, 4, 5), (6, 7, 9, 10, 11)]
...
Not the most efficient or fastest, but probably the easiest. The last line is a simple way to dedupe a list in python. In this case, pairs like (0,1) and (1,0) are in the output. Not sure if you'd consider those duplicates or not.
l = [0, 1, 2, 3, 4, 5]
pairs = []
for x in l:
for y in l:
pairs.append((x,y))
pairs = list(set(pairs))
print(pairs)
Output:
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]
精彩评论