Creating restricted permutations of a list of items by category
I am trying to create a number of restricted permutations of a list of items. Each item has a category, and I need to find combinations of items such that each combination does not have multiple items from the same category. To illustrate, here's some sample data:
Name | Category
==========|==========
1. Orange | fruit
2. Apple | fruit
3. GI-Joe | toy
4. VCR | electronics
5. Racquet | sporting goods
The combinations would be restricted to length three, I don't need every combination of every length. So a set of combinations for the above list could be:
(Orange, GI-Joe, VCR)
(Orange, GI-Joe, Racquet)
(Orange, VCR, Racquet)
(Apple, GI-Joe, VCR)
(Apple, GI-Joe, Racquet)
... and so on.
I do this fairly often, on various lists. The lists will never be more than 40 items in length, but understandably that could create thousands of combinations (though there will likely be around 10 unique categories per each list, restricting it somewhat)
I've come up with some pseudo-python for how I would implement it recursively. It's been too long since I took combinatorics, but from what I rec开发者_如何学Goall this is essentially a subset of the combinations of the set, something like C(list length, desired size). There's likely some library modules which can make this cleaner (or at least more performant)
I was wondering if perhaps there was a better approach than what I've got (perhaps one which uses itertools.combinations
somehow):
# For the sake of this problem, let's assume the items are hashable so they
# can be added to a set.
def combinate(items, size=3):
assert size >=2, "You jerk, don't try it."
def _combinate(index, candidate):
if len(candidate) == size:
results.add(candidate)
return
candidate_cats = set(x.category for x in candidate)
for i in range(index, len(items)):
item = items[i]
if item.category not in candidate_cats:
_combinate(i, candidate + (item, ))
results = set()
for i, item in enumerate(items[:(1-size)]):
_combinate(i, (item, ))
return results
Naive approach:
#!/usr/bin/env python
import itertools
items = {
'fruits' : ('Orange', 'Apple'),
'toys' : ('GI-Joe', ),
'electronics' : ('VCR', ),
'sporting_goods' : ('Racquet', )
}
def combinate(items, size=3):
if size > len(items):
raise Exception("Lower the `size` or add more products, dude!")
for cats in itertools.combinations(items.keys(), size):
cat_items = [[products for products in items[cat]] for cat in cats]
for x in itertools.product(*cat_items):
yield zip(cats, x)
if __name__ == '__main__':
for x in combinate(items):
print x
Will yield:
# ==>
#
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('sporting_goods', 'Racquet')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Apple')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
What you seek to generate is the Cartesian product of elements taken from set of category
.
The partitioning into multiple sets is relatively easy:
item_set[category].append(item)
With proper instantiation (e.g.) collections.defaultdict for item_set[category]
and then itertools.product
will give you the desired output.
精彩评论