Query for A Likes B, B Likes C, Two Degrees of Separation
suppose you have the following table named Likes:
A|B
---
a|b
a|f
a|e
a|i
b|a
b|i
c|d
e|p
In this table, values in A represent people who "like" people in B. So, a likes b, a likes f, a likes e, and so forth. How do you write a query such that you get the number of different users who are two degrees of separation from each user? So as an example, if a likes b, then b is one degree of 开发者_开发知识库separation from a. If a likes b, and b likes c, then c is two degrees of separation from a. One more example, if a likes b, and b likes a, then a is two degrees of separation from itself (we don't exclude cycles). So the output should be something like this:
User|CountOfUsersWhoAreTwoDegreesFromUser
-----------------------------------------
a | -
b | -
c | -
e | -
Now, I'm not sure what our counts would be for each user, so I didn't write it in the above table. Also, no persons in the table Likes like themselves. So you will not see a combination like a|a in Likes, or b|b in Likes. Can anyone help me out with this?
select primary.A, count(*)
from likes primary
inner join likes secondary on primary.B = secondary.A
group by primary.A
Since you need consider only two connections at once, this can be done with joins. (If you had to consider the full closure of the Likes relation, then you would need the full power of recursion, such as an implementation of Dijkstra's algorithm.)
SELECT X.A AS user, COUNT(DISTINCT Y.B) AS countOfUsersWhoAreTwoDegreesFromUser
FROM Likes AS X
INNER JOIN Likes AS Y
ON X.B = Y.A
GROUP BY user
EDIT: To be clear, this problem is simple and reasonable efficient for any fixed degree of separation.y
EDIT 2: Here's a variant solution which will prevent a user from being counted as two degrees from themselves. This varies from the literal problem description, but might be what was intended.
SELECT X.A AS user, COUNT(DISTINCT Y.B) AS countOfUsersWhoAreTwoDegreesFromUser
FROM Likes AS X
INNER JOIN Likes AS Y
ON X.B = Y.A
WHERE X.A <> Y.B
GROUP BY user
NOTE: this approach is not Scalable ... If you are interested ONLY and ONLY in two degrees, we can go for a self join ...
Condition T1.A <> T2.B is to filter out A liking A, Distinct is applied so that even if A like C by two degrees by two different path, its still counted as 1.
SELECT T.A, Count(T.B)
FROM
(
SELECT DISTINCT T1.A, T2.B
FROM Table1 T1
INNER JOIN Table1 T2 on T1.B = T2.A AND T1.A <> T2.B
) T
GROUP BY T.A
To cope with an arbitrary value for degree, a CTE could be used in PostgreSQL:
with recursive graph (a, b, path, degree) as ( select a, b, array[a::text, b::text] as path, 1 as degree from likes union all select l.a, l.b, g.path || l.b::text, g.degree + 1 from likes l join graph g on l.a = g.b and l.b g.a ) select * from graph where degree = 2
精彩评论