My naive maximal clique finding algorithm runs faster than Bron-Kerbosch's. What's wrong?
In short, my naive code (in Ruby) looks like:
# $seen is a hash to memoize previously seen sets
# $sparse is a hash of usernames to a list of neighboring usernames
# $set is the list of output clusters
$seen = {}
def subgraph(set, adj)
hash = (set + adj).sort
return if $seen[hash]
$sets.push set.sort.join(", ") if adj.empty? and set.size > 2
adj.each {|node| subgraph(set + [node], $sparse[node] & adj)}
$seen[hash] = true
end
$sparse.keys.each do |vertex|
subgraph([vertex], $sparse[vertex])
end
And my Bron Kerbosch implementation:
def bron_kerbosch(set, points, exclude)
$sets.push set.sort.join(', ') if set.size > 2 and exclude.empty? and points.empty?
points.each_with_index do |vertex, i|
points[i] = nil
bron_kerbosch(set + [vertex],
points & $sparse[vertex],
exclude & $sparse[vertex])
exclude.push vertex
end
end
b开发者_运维知识库ron_kerbosch [], $sparse.keys, []
I also implemented pivoting and degeneracy ordering, which cut down on bron_kerbosch execution time, but not enough to overtake my initial solution. It seems wrong that this is the case; what algorithmic insight am I missing? Here is a writeup with more detail if you need to see fully working code. I've tested this on pseudo-random sets up to a million or so edges in size.
I don't know how you generate the random graphs for your tests but I suppose you use a function which generates a number according to a uniform distribution and thus you obtain a graph that is very homogeneous. That's a common problem when testing algorithms on graphs, it is very difficult to create good test cases (it's often as hard as solving the original problem).
The max-clique problem is a well-known NP hard problem and both algorithms (the naive one and the Bron Kerbosch one) have the same complexity so we can't expect a global improvement on all testcase but just an improvement on some particular cases. But because you used a uniform distribution to generate your graph, you don't have this particular case.
That's why performance of both algorithms is very similar on your data. And because Bron Kerbosch algorithm is a little more complex than the naive one, the naive one is faster.
精彩评论