开发者

Implementing Disjoint Set System In Python

What I have so far is largely based off page 571 of "Introduction To Algorithms" by Cormen et al.

I have a Node class in python that represents a set:

class Node:
    def __init__(self, parent, rank = 0):
        self.parent = parent
        self.rank = rank

This implementation is going to use a List of Nodes as the forest (I am open to better ways to store the sets).

Initialize() returns a list of Nodes, that I will store in variable set and pass into the other functions.

Find searches through the forest for the value and returns the set that it appears in. I chose to use for s in range(len(set)): so that in the recursion I could shrink the list of sets being passed in by set[s:].

def Find(set, value):
    for s in range(len(set)):
        if value != set[s].parent:
            set[开发者_JAVA技巧s].parent = Find(set[s:], set[s].parent)
        return set[s]

Merge merges sets in the forest by finding them and promoting the higher ranked set.

def Merge(set, value1, value2):
    set_val_1 = Find(set, value1)
    set_val_2 = Find(set, value2)
    if set_val_1.rank > set_val_2.rank:
        set_val_2.parent = set_val_1
    else:
        set_val_1.parent = set_val_2
        if set_val_1.rank == set_val_2.rank:
            set_val_2.rank += 1

I am getting some errors when I do Finds and Merges, namely Find is not returning the proper set, and so I am not sure if Merge is working properly as well. I would appreciate some help to make sure the functions are implemented properly.


Have you looked at any other existing implementations?


I don't have the latest edition of the book, but this doesn't look quite like a disjoint-set forest.

I think your mistake is to think that the forest has to be stored in a collection and that you have to traverse this collection to do the operations on the nodes. Remove set from Merge() and Find() and implement Find() as

def Find(n):
    if n != n.parent:
        n.parent = Find(n.parent)
    return n.parent

just like in the book. Then add a MakeSet() that returns a single correctly initialized node, and maybe a SameSet() function too:

def SameSet(n1, n2):
    return Find(n1) == Find(n2)

You now have a working disjoint set implementation.


Presuming that each node is initialised to be its own parent:

def Find(node):
    while node is not node.parent:
        node = node.parent
    return node


Using this implementation as a starting point, I've created a new python class to handle disjoint sets, which also supports persistence using a MongoDb.

With my class you should be able, for example, to:

  • Create an instance of UnionFind(), which uses python build-in dictionaries by default, and decide later to consolidate() your results in a MongoDb collection
  • Create an instance of UnionFind from a MongoDb collection and directly use it.

You might want to check the code on github.

Cheers, Simone


The Wikipedia page provides pseudocode for the basic operations on disjoint-set data structure. Here's a direct port to Python (employs path compression and union by rank):

class Node:
    """Represents an element of a set."""
    def __init__(self, id):
        self.id = id
        self.parent = self
        self.rank = 0
        self.size = 1

    def __repr__(self):
        return 'Node({!r})'.format(self.id)


def Find(x):
    """Returns the representative object of the set containing x."""
    if x.parent is not x:
        x.parent = Find(x.parent)
    return x.parent


def Union(x, y):
    """Combines the sets x and y belong to."""
    xroot = Find(x)
    yroot = Find(y)

    # x and y are already in the same set
    if xroot is yroot:
        return

    # x and y are not in same set, so we merge them
    if xroot.rank < yroot.rank:
        xroot, yroot = yroot, xroot  # swap xroot and yroot

    # merge yroot into xroot
    yroot.parent = xroot
    xroot.size += yroot.size
    if xroot.rank == yroot.rank:
        xroot.rank = xroot.rank + 1

Demo:

>>> a, b, c = map(Node, 'abc')
>>> Find(a), Find(b), Find(c)
(Node('a'), Node('b'), Node('c'))
>>> Find(a).size
1
>>> Union(a, b)
>>> Union(b, c)
>>> Find(a), Find(b), Find(c)
(Node('a'), Node('a'), Node('a'))
>>> Find(a).size
3
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜