开发者

How to count common neighbors of two users and calculate similarity?

The monthly_connections table contains columns calling_party, called_party, common_neighbors, neighborhood_overlap

So the table describes which users are connected. One of the measures for user similarity is neighborhood overlap which is defined as following:

neighborhood_overlap = (number开发者_如何学运维 of nodes who are neighbors of both calling_party and called_party)/(number of nodes who are neighbors of at least one of calling_party or called_party)

Trying to calculate the number of common neighbors for two users I wrote the following query:

SELECT 
COUNT (*) FROM 
(SELECT t1.neighborA 
     FROM (
          SELECT called_party AS neighborA FROM monthly_connections 
          WHERE calling_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1' 
          UNION SELECT calling_party AS neighborA FROM monthly_connections
                WHERE called_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1') t1                  
          INNER JOIN (SELECT called_party AS neighborB FROM monthly_connections 
                      WHERE calling_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2'
                      UNION SELECT calling_party AS neighborB FROM monthly_connections
                            WHERE called_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2') t2 ON t1.neighborA = t2.neighborB) t3

The query above calculates the number of common neighbors of users 10D149A4356E1AA3A8AF604BD992BBA141DB53D2 and 9F7334BCF9000CD68D40302DC4801E60C027A7D1

The goal is to write the query to set the value of column common neighbors and neighborhood overlap for each pair of connection in the table

Does anyone know how to write the query to update columns common_neighbors and neighborhood_overlap?

For common neighbors I started to write the following query but it is not correct:

 UPDATE mc SET
    common_neighbors = 
    (SELECT COUNT (*) FROM 
(SELECT t1.neighborA FROM (SELECT called_party AS neighborA FROM monthly_connections WHERE calling_party = mc.calling_party UNION SELECT calling_party AS neighborA FROM monthly_connections WHERE called_party = mc.calling_party) t1 INNER JOIN (SELECT called_party AS neighborB FROM monthly_connections WHERE calling_party = mc.called_party UNION SELECT calling_party AS neighborB FROM monthly_connections WHERE called_party = mc.called_party) t2 ON t1.neighborA = t2.neighborB) t3) FROM monthly_connections mc INNER JOIN t3 ON t3.calling_party = mc.calling_party AND t3.called_party = mc.called_party


I think this query works (though might not be that performant).

UPDATE mc 
   SET common_neighbors = (SELECT COUNT (*) FROM
     (
      (SELECT called_party FROM monthly_connections 
        WHERE calling_party = mc.calling_party
         UNION
       SELECT calling_party FROM monthly_connections 
        WHERE called_party = mc.calling_party
      )
        INTERSECT
      (SELECT calling_party FROM monthly_connections 
       WHERE called_party = mc.called_party
         UNION
       SELECT called_party FROM monthly_connections 
        WHERE calling_party = mc.called_party
      )
     ) t1
   ) FROM monthly_connections mc
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜