3-clique counting in a graph
I am operating on a (not so) large graph having about 380K edges. I wrote a program to count the number of 3-cliques in the graph. A quick example:
List of edges:
A - B
B - C
C - A
C - D
List of cliques:
A - B - C
MySQL Table structure:
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| v1 | bigint(20) | YES | MUL | NULL | |
| v2 | bigint(20) | YES | MUL | NULL | |
+-------+------------+------+-----+---------+-------+
A 3-clique is nothing but a triangle in a graph. Currently, I am doing this using PHP+MySQL. As expected, it is not fast enough. Is there a way to do this in pure MySQL? (perhaps a way to insert all 3-cliqu开发者_C百科es into a table?)
SELECT T1.v1, T2.v1, T3.v1 FROM TableName T1, TableName T2, TableName T3
WHERE T1.v1 < T1.v2 AND T2.v1 < T2.v2 AND T3.v1 < T3.v2
AND T1.v1 = T3.v1 AND T1.v2 = T2.v1 AND T2.v2 = T3.v2
Should do the trick. What I've done there is ensure that v1 is less than v2 for all edges considered, simply to remove duplicates. Then it is a simple matter of joining up the edges by their start/end points. Returning the first point in each of the pairs.
If you have edges going from a node back to the same node you may need to add in extra checks as appropriate.
EDIT: Made a change, thanks to Legend.Reminded me that we needed to make sure that the edge found in T3 matches up to the edge in T1, so we have to link the first in each together! Originally I had T3.v1 > T3.v2 in the first line of the where clause but changed it to reduce confusion, forgot to change the second part though!
精彩评论