开发者

Forming triangles from points and relations

I want to generate triangles from points and optional relations between them. Not all points form triangles, but many of them do.

In the initial structure, I've got a database with the following tables:

Nodes(id, value)

Relations(id, nodeA, nodeB, value)

Triangles(id, relation1_id, relation2_id, relation3_id)

In order to generate triangles from both nodes and relations table, I've used the following query:

INSERT INTO Triangles
SELECT t1.id, t2.id , t3.id, 
FROM Relations t1, Relations t2, Relations t3
WHERE t1.id < t2.id AND t3.id > t1.id AND
(
t1.nodeA = t2.nodeA 
AND (t3.nodeA = t1.nodeB AND t3.nodeB = t2.nodeB
OR t3.nodeA = t2.nodeB AND t3.nodeB = t1.nodeB)

OR 

t1.nodeA = t2.nodeB
AND (t3.nodeA = t1.nodeB AND t3.nodeB = t2.nodeA
OR t3.nodeA = t2.nodeA AND t3.nodeB = t1.nodeB)
)

It's working perfectly on small sized data. (~< 50 points) In some cases however, I've got around 100 points all related to each other which leads to thousands of relations. So when the expected number of triangles is in the hundreds of thousands, or even in 开发者_StackOverflowthe millions, the query might take several hours.

My main problem is not in the select query, while I see it execute in Management Studio, the returned results slow. I received around 2000 rows per minute, which is not acceptable for my case.

As a matter of fact, the size of operations is being added up exponentionally and that is terribly affecting the performance.

I've tried doing it as a LINQ to object from my code, but the performance was even worse.

I've also tried using SqlBulkCopy on a reader from C# on the result, also with no luck.

So the question is... Any ideas or workarounds?


I think you just need this:

INSERT
INTO    triangles (relation1_id, relation2_id, relation3_id)
SELECT  r12.id, r23.id, r13.id
FROM    relations r12
JOIN    relations r23
ON      r23.nodeA = r12.nodeB
JOIN    relations r13
ON      r13.nodeA = r12.nodeA
        AND r13.nodeB = r23.nodeB
ON      r12.nodeA = n1.id
WHERE   NOT EXISTS
        (
        SELECT  relation1_id, relation2_id, relation3_id
        FROM    triangles
        INTERSECT
        SELECT  r12.id, r23.id, r13.id
        )

Make the following check constraint:

ALTER TABLE relations ADD CONSTRAINT CHECK (nodeA < nodeB)

This point is that since the relations are symmetric, you only need to store the relation once per pair and the triangle once per permutation.

The check constraint ensures that the relations are stored once per pair in the fixed order.

The query is designed so that the triangles are also stored in fixed order: 1 - 3 first, 2 - 3 second, 1 - 3 third, where 1, 2 and 3 are determined by the order of nodes the relations link.

Also note that the number of triangles grows as O(n^3), not exponentially (O(exp(N))).

For 100 nodes, there can be at most 100 * 99 * 98 / 6 = 161700 triangles.


The query you are using has some problems:

  1. "t1.id < t2.id AND t3.id > t1.id" doesn't prevent t2 = t3. Change this to "WHERE t1.id < t2.id AND t3.id > t2.id"
  2. t1.nodeA might not exist in t2 at all.

Quassnoi's solution looks promising. It has problems as well. First of all, it is selecting node IDs, not relation IDs, into triangles. Second, it isn't checking that a relation between n1 and n3 exists. I'd rewrite the statement as:

INSERT
INTO    triangles (relation1_id, relation2_id, relation3_id)
SELECT  r12.id, r23.id, r13.id
FROM    relations r12
JOIN    relations r23
ON      r12.nodeB = r23.nodeA
JOIN    relations r13
ON      r12.nodeA = r13.nodeA AND r23.nodeB = r13.nodeB
WHERE   NOT EXISTS
        (
        SELECT  relation1_id, relation2_id, relation3_id
        FROM    triangles
        WHERE   relation1_id = r12.id AND relation2_id = r23.id AND relation3_id = r13.id
        )

This solution doesn't depend on any constraint on ordering of IDs in the relations table, and doesn't guarantee any order of IDs in the triangles table.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜