Is there an elegant way to keep track of sets of connected items in Python?
For a certain piece of code, I needed to find a way to recognise certain aliases. Thing is, it is not known beforehand what those aliases are.
These are my requirements:
- If A and B are aliases, and B and C are aliases, A and C should be aliases as well.
- Two sets of aliases should be merged when they are connected in any way.
- In each set of aliases, one should be the primary alias.
I use the following solution, making use of what boils down to a dictionary of sets:
class Alias(object):
def __init__(self, initial):
self._set = {initial}
self.initial = initial
def add(self, alias):
self._set.add(alias)
def merge(self, other):
self._set.update(other._set)
def __iter__(self):
return iter(self._set)
class AliasDict(object):
def __init__(self):
self._dict = {}
def add(self, one, other):
if one in self._dict:
if other in self._dict: #merge!
self._dict[one].merge(self._dict[other])
for k in self._dict[other]:
self._dict[k] = self._dict[one]
else:
self._dict[one].add(other)
elif other in self._dict:
self._dict[other].add(one)
else:
self._dict[one] = self._dict[other] = Alias(one)
self._dict[one].add(other)
def get(self, n):
return self._dict.get(n)
def __contains__(self, s):
return s in self._dict
开发者_运维知识库
Could this be improved upon? For example, by making use of a class in the standard library (I've searched, but I might have missed something useful.)
Have you considered using a disjoint set? It is virtually O(1)
in speed, easy to implement, and seems to match your requirements exactly.
This is something you can map on a graph so I'd do:
from networkx import Graph
from networkx.algorithms.components.connected import connected_components
# see aliases as the edges between nodes in a graph
aliases = [('A', 'B'), ('B', 'C'), ('D','E')]
g = Graph( aliases )
# connected components are alias groups
print connected_components(g) # [['A', 'C', 'B'], ['E', 'D']]
You don't specify which alias should be the primary one, so you might as well pick the first one from these lists.
networkx
module
精彩评论