SQL Order by Number of Common Values

2023-02-12 15:48

I have two tables set up as follows:

+---------------+
| resources     |
+---------------+
| id            |
| name          |
+---------------+
+-------------------+
| resource_subjects |
+-------------------+
| resource_id       |
| subject_id        |
+-------------------+

What I need to do is construct a query that finds the number of subjects shared by each pair of resources.

So with a resource_subjects table like this:

+---------------------------+
| resource_id | subject_id  |
+---------------------------+
| resource1   | 1           |
| resource1   | 2           |
| resource1   | 3           |
| resource1   | 4           |
| resource1   | 5           |
| resource2   | 1           |
| resource2   | 2           |
| resource2   | 3           |
| resource3   | 1           |
| resource3   | 4           |
+---------------------------+

I'd like this query to give me something like:

+----------------------------------------------------------+
| first_resource | second_resource | shared_subjects_count |
+----------------------------------------------------------+
| resource1      | resource2       | 3                     |
| resource1      | resource3       | 2                     |
| resource2      | resource3       | 1                     |
+----------------------------------------------------------+

To get the idea across, the pseudo-code in my head would read something like:

SELECT id AS first_resource,
       id AS second_resource,
       COUNT(number of subjects shared between first_resource and second_resource in resource_subjects table) AS shared_subjects_count
ORDER BY shared_subjects_count DESC

If anyone can provide a sample query, or even point me in the right direction, that'd be amazing.

Thanks

To create a table similar to yours I used:

CREATE TABLE resource_subjects (
res_id int(11),
sub_id int(11)
);

INSERT INTO resource_subjects VALUES 
(1,1),
(1,2),
(1,3),
(1,4),
(1,5),
(2,1),
(2,2),
(2,3),
(3,1),
(3,4);

And then the query you can use is:

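SELECT t2.res_id AS first_resource, t1.res_id AS second_resource,
       COUNT(t1.sub_id) AS shared_subjects_count
FROM resource_subjects t1
-- the > comparison keeps each pair once and skips self-pairs
JOIN resource_subjects t2 ON t1.res_id > t2.res_id AND t1.sub_id = t2.sub_id
GROUP BY t2.res_id, t1.res_id
ORDER BY shared_subjects_count DESC;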

Note that my res_id is an integer rather than a string. The greater-than condition in the ON clause keeps each pair of resources only once and excludes a resource matched with itself.
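
The comparison in the ON clause does not seem to depend on the ids being integers: string ids like those in the question have an ordering too. Here is a minimal sketch to check that, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL. The function name shared_subject_counts and the TEXT column type are assumptions made for illustration, not part of the original setup.

```python
import sqlite3

# Sample rows from the question, with string resource ids.
ROWS = [
    ("resource1", 1), ("resource1", 2), ("resource1", 3),
    ("resource1", 4), ("resource1", 5),
    ("resource2", 1), ("resource2", 2), ("resource2", 3),
    ("resource3", 1), ("resource3", 4),
]

QUERY = """
SELECT t2.res_id AS first_resource,
       t1.res_id AS second_resource,
       COUNT(t1.sub_id) AS shared_subjects_count
FROM resource_subjects t1
JOIN resource_subjects t2
  ON t1.res_id > t2.res_id AND t1.sub_id = t2.sub_id
GROUP BY t2.res_id, t1.res_id
ORDER BY shared_subjects_count DESC
"""


def shared_subject_counts(rows):
    """Load rows into an in-memory SQLite table and run the self-join."""
    conn = sqlite3.connect(":memory:")
    try:
        # res_id is TEXT here (an assumption) to mirror the question's data.
        conn.execute(
            "CREATE TABLE resource_subjects (res_id TEXT, sub_id INTEGER)"
        )
        conn.executemany("INSERT INTO resource_subjects VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in shared_subject_counts(ROWS):
        print(row)
```

String ids are compared in collation order, so 'resource10' would sort before 'resource2'. That only affects which id ends up in which column, not the counts.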

This seems to work in MySQL:

select
  a.resource_id as first_resource,
  b.resource_id as second_resource,
  count(*) as shared_subjects_count

from
  resource_subjects as a,
  resource_subjects as b

where
  a.resource_id < b.resource_id
  and a.subject_id = b.subject_id
group by a.resource_id, b.resource_id
order by shared_subjects_count desc;
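
To sanity-check the numbers without a database, the same counts can be reproduced with set intersections. This is only a sketch of what the self-join computes, not a replacement for the query; the function name shared_counts is made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Sample rows from the question: (resource_id, subject_id).
ROWS = [
    ("resource1", 1), ("resource1", 2), ("resource1", 3),
    ("resource1", 4), ("resource1", 5),
    ("resource2", 1), ("resource2", 2), ("resource2", 3),
    ("resource3", 1), ("resource3", 4),
]


def shared_counts(rows):
    """Count subjects shared by each pair of resources, highest count first."""
    subjects = defaultdict(set)
    for resource_id, subject_id in rows:
        subjects[resource_id].add(subject_id)

    pairs = []
    # combinations() over sorted ids yields each pair once with a < b,
    # mirroring the a.resource_id < b.resource_id condition.
    for a, b in combinations(sorted(subjects), 2):
        shared = len(subjects[a] & subjects[b])
        if shared:  # the inner join drops pairs with nothing in common
            pairs.append((a, b, shared))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    for pair in shared_counts(ROWS):
        print(pair)
```

One difference to keep in mind: sets count each subject once, whereas count(*) would count a repeated (resource_id, subject_id) row more than once if the table allowed duplicates.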
