Slicing a list into n nearly-equal-length partitions [duplicate]
I'm looking for a fast, clean, pythonic way to divide a list into exactly n nearly-equal partitions.
partition([1,2,3,4,5],5)->[[1],[2],[3],[4],[5]]
partition([1,2,3,4,5],2)->[[1,2],[3,4,5]] (or [[1,2,3],[4,5]])
partition([1,2,3,4,5],3)->[[1,2],[3,4],[5]] (there are other ways to slice this one too)
There are several answers in here Iteration over list slices that run very close to what I want, except they are focused on the size of the list, and I care about the number of the lists (some of them also pad with None). These are trivially converted, obviously, but I'm looking for a best practice.
Similarly, people have pointed out great solutions here How do you split a list into evenly sized chunks? for a very similar problem, but I'm more interested in the number of partitions than the specific size, as long as it's within 1. Again, this is trivially convertible, but I'm looking for a best practice.
Just a different take, that only works if [[1,3,5],[2,4]]
is an acceptable partition, in your example.
def partition ( lst, n ):
return [ lst[i::n] for i in xrange(n) ]
This satisfies the example mentioned in @Daniel Stutzbach's example:
partition(range(105),10)
# [[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
# [1, 11, 21, 31, 41, 51, 61, 71, 81, 91, 101],
# [2, 12, 22, 32, 42, 52, 62, 72, 82, 92, 102],
# [3, 13, 23, 33, 43, 53, 63, 73, 83, 93, 103],
# [4, 14, 24, 34, 44, 54, 64, 74, 84, 94, 104],
# [5, 15, 25, 35, 45, 55, 65, 75, 85, 95],
# [6, 16, 26, 36, 46, 56, 66, 76, 86, 96],
# [7, 17, 27, 37, 47, 57, 67, 77, 87, 97],
# [8, 18, 28, 38, 48, 58, 68, 78, 88, 98],
# [9, 19, 29, 39, 49, 59, 69, 79, 89, 99]]
Here's a version that's similar to Daniel's: it divides as evenly as possible, but puts all the larger partitions at the start:
def partition(lst, n):
q, r = divmod(len(lst), n)
indices = [q*i + min(i, r) for i in xrange(n+1)]
return [lst[indices[i]:indices[i+1]] for i in xrange(n)]
It also avoids the use of float arithmetic, since that always makes me uncomfortable. :)
Edit: an example, just to show the contrast with Daniel Stutzbach's solution
>>> print [len(x) for x in partition(range(105), 10)]
[11, 11, 11, 11, 11, 10, 10, 10, 10, 10]
def partition(lst, n):
division = len(lst) / float(n)
return [ lst[int(round(division * i)): int(round(division * (i + 1)))] for i in xrange(n) ]
>>> partition([1,2,3,4,5],5)
[[1], [2], [3], [4], [5]]
>>> partition([1,2,3,4,5],2)
[[1, 2, 3], [4, 5]]
>>> partition([1,2,3,4,5],3)
[[1, 2], [3, 4], [5]]
>>> partition(range(105), 10)
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39, 40, 41], [42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52], [53, 54, 55, 56, 57, 58, 59, 60, 61, 62], [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73], [74, 75, 76, 77, 78, 79, 80, 81, 82, 83], [84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94], [95, 96, 97, 98, 99, 100, 101, 102, 103, 104]]
Python 3 version:
def partition(lst, n):
division = len(lst) / n
return [lst[round(division * i):round(division * (i + 1))] for i in range(n)]
Below is one way.
def partition(lst, n):
increment = len(lst) / float(n)
last = 0
i = 1
results = []
while last < len(lst):
idx = int(round(increment * i))
results.append(lst[last:idx])
last = idx
i += 1
return results
If len(lst) cannot be evenly divided by n, this version will distribute the extra items at roughly equal intervals. For example:
>>> print [len(x) for x in partition(range(105), 10)]
[11, 10, 11, 10, 11, 10, 11, 10, 11, 10]
The code could be simpler if you don't mind all of the 11s being at the beginning or the end.
This answer provides a function split(list_, n, max_ratio)
, for people
who want to split their list into n
pieces with at most max_ratio
ratio in piece length. It allows for more variation than the
questioner's 'at most 1 difference in piece length'.
It works by sampling n piece lengths within the desired ratio range [1 , max_ratio), placing them after each other to form a 'broken stick' with the right distances between the 'break points' but the wrong total length. Scaling the broken stick to the desired length gives us the approximate positions of the break points we want. To get integer break points requires subsequent rounding.
Unfortunately, the roundings can conspire to make pieces just too short, and let you exceed the max_ratio. See the bottom of this answer for an example.
import random
def splitting_points(length, n, max_ratio):
"""n+1 slice points [0, ..., length] for n random-sized slices.
max_ratio is the largest allowable ratio between the largest and the
smallest part.
"""
ratios = [random.uniform(1, max_ratio) for _ in range(n)]
normalized_ratios = [r / sum(ratios) for r in ratios]
cumulative_ratios = [
sum(normalized_ratios[0:i])
for i in range(n+1)
]
scaled_distances = [
int(round(r * length))
for r in cumulative_ratios
]
return scaled_distances
def split(list_, n, max_ratio):
"""Slice a list into n randomly-sized parts.
max_ratio is the largest allowable ratio between the largest and the
smallest part.
"""
points = splitting_points(len(list_), n, ratio)
return [
list_[ points[i] : points[i+1] ]
for i in range(n)
]
You can try it out like so:
for _ in range(10):
parts = split('abcdefghijklmnopqrstuvwxyz', 4, 2)
print([(len(part), part) for part in parts])
Example of a bad result:
parts = split('abcdefghijklmnopqrstuvwxyz', 10, 2)
# lengths range from 1 to 4, not 2 to 4
[(3, 'abc'), (3, 'def'), (1, 'g'),
(4, 'hijk'), (3, 'lmn'), (2, 'op'),
(2, 'qr'), (3, 'stu'), (2, 'vw'),
(3, 'xyz')]
精彩评论