SQL Query for all pairs of elements that are only in different groups

2023-02-15 17:02

I have a table called survey with a group column and a subject column:

CREATE TABLE survey (
  `group` INT NOT NULL,
  `subject` VARCHAR(16) NOT NULL,
  UNIQUE INDEX (`group`, `subject`)
);

INSERT INTO survey 
  VALUES
  (1, 'sports'),
  (1, 'history'),
  (2, 'art'),
  (2, 'music'),
  (3, 'math'),
  (3, 'sports'),
  (3, 'science')
;

I am trying to figure out a query that will return all pairs of subjects that are not part of the same group. So from my above example, I would like to see these pairs returned in a table:

science - history  
science - art  
science - music  
history - math  
sports  - art  
sports  - music  
history - art  
history - music

Thus, the query shouldn't return:

sports - history

as an example since they are both in Group 1.

Thanks so much.

SELECT s1.subject,
       s2.subject
FROM   survey s1
       JOIN survey s2
         ON s1.subject < s2.subject
GROUP  BY s1.subject,
          s2.subject
HAVING COUNT(CASE
               WHEN s1.groupid = s2.groupid THEN 1
             END) = 0
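
To check the query above against the question's data, here is a minimal sketch that runs it with Python's built-in sqlite3 module. SQLite is an assumption made only to keep the example self-contained (the thread targets MySQL), and the column is named groupid as in the query.

```python
import sqlite3

# In-memory SQLite database; an assumption for a runnable demo, not the asker's MySQL setup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (groupid INT NOT NULL, subject VARCHAR(16) NOT NULL)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [(1, "sports"), (1, "history"), (2, "art"), (2, "music"),
     (3, "math"), (3, "sports"), (3, "science")],
)

# Self-join on subject order, then keep only pairs with no row combination sharing a group.
query = """
SELECT s1.subject, s2.subject
FROM   survey s1
       JOIN survey s2
         ON s1.subject < s2.subject
GROUP  BY s1.subject, s2.subject
HAVING COUNT(CASE WHEN s1.groupid = s2.groupid THEN 1 END) = 0
"""

pairs = sorted(conn.execute(query).fetchall())
for left, right in pairs:
    print(left, "-", right)
```

On the sample data this yields ten pairs. Besides the eight listed in the question, it also includes art - math and math - music, which share no group either.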

Sample table

create table Survey(groupid int, subject varchar(100))
insert into Survey select
1, 'sports' union all select
1, 'history' union all select
2, 'art' union all select
2, 'music' union all select
3, 'math' union all select
3, 'sports' union all select
3, 'science'

An ANSI-style query that should work on most mainstream RDBMSs. The GROUP BY / HAVING step keeps a pair only when none of A's groups also contains B, which matters when A belongs to more than one group:

select A.subject, B.subject
from (select distinct subject from Survey) A
inner join (select distinct subject from Survey) B on B.subject > A.subject
left join Survey C on C.subject = A.subject
left join Survey D on D.subject = B.subject and D.groupid = C.groupid
group by A.subject, B.subject
having count(D.groupid) = 0
order by A.subject, B.subject

Here's a slightly different approach:

SELECT *
FROM (SELECT DISTINCT subject FROM yourtable) AS T1
JOIN (SELECT DISTINCT subject FROM yourtable) AS T2
ON T1.subject < T2.subject
WHERE NOT EXISTS
(
    SELECT *
    FROM yourtable T3
    JOIN yourtable T4
    ON T3.groupid = T4.groupid
    WHERE T1.subject = T3.subject
    AND T2.subject = T4.subject
)
ORDER BY t1.subject, t2.subject;
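
One case the sample data does not exercise is a subject that belongs to several groups and also sorts first alphabetically. The sketch below runs the NOT EXISTS form on such data using Python's sqlite3 module. The SQLite engine, the groupid column name, and the three-subject data set are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (groupid INT NOT NULL, subject VARCHAR(16) NOT NULL)")
# Hypothetical data: 'art' is in groups 1 and 2, 'music' only in 1, 'zoo' only in 3.
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [(1, "art"), (2, "art"), (1, "music"), (3, "zoo")],
)

# A pair survives only if no row of the first subject shares a group
# with any row of the second subject.
query = """
SELECT T1.subject, T2.subject
FROM (SELECT DISTINCT subject FROM survey) AS T1
JOIN (SELECT DISTINCT subject FROM survey) AS T2
  ON T1.subject < T2.subject
WHERE NOT EXISTS
(
    SELECT 1
    FROM survey T3
    JOIN survey T4
      ON T3.groupid = T4.groupid
    WHERE T3.subject = T1.subject
      AND T4.subject = T2.subject
)
ORDER BY T1.subject, T2.subject
"""

pairs = conn.execute(query).fetchall()
print(pairs)
```

Here art and music share group 1, so that pair is dropped even though art also has a row in group 2, where music does not appear.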

The standard way would be to use EXCEPT (called MINUS in Oracle) to take all pairs and remove those that occur in the same group, but MySQL versions before 8.0.31 don't support it. There, you can transform the EXCEPT into a statement based on the NOT IN operator and a sub-query:

SELECT s1.subject, s2.subject
  FROM survey AS s1
    JOIN survey AS s2
  WHERE (s1.subject, s2.subject) NOT IN
  (
    SELECT s1.subject, s2.subject
      FROM survey AS s1
        JOIN survey AS s2
          ON s1.group = s2.group
  )
;

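Note that this returns each pair in both orders, and a subject that is in several groups (such as sports) produces the same pair more than once. If you don't want duplicates, use SELECT DISTINCT; to keep only one order per pair, add AND s1.subject < s2.subject to the outer WHERE clause.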

With indices and the sample data, the extended query plan is:

+----+--------------------+-------+--------+---------------+-------+---------+--------------------+------+----------+---------------------------------------------+
| id | select_type        | table | type   | possible_keys | key   | key_len | ref                | rows | filtered | Extra                                       |
+----+--------------------+-------+--------+---------------+-------+---------+--------------------+------+----------+---------------------------------------------+
|  1 | PRIMARY            | s1    | index  | NULL          | group | 54      | NULL               |    7 |   100.00 | Using index; Using temporary                |
|  1 | PRIMARY            | s2    | index  | NULL          | group | 54      | NULL               |    7 |   100.00 | Using where; Using index; Using join buffer |
|  2 | DEPENDENT SUBQUERY | s1    | index  | group         | group | 54      | NULL               |    7 |    85.71 | Using where; Using index                    |
|  2 | DEPENDENT SUBQUERY | s2    | eq_ref | group         | group | 54      | test.s1.group,func |    1 |   100.00 | Using where; Using index                    |
+----+--------------------+-------+--------+---------------+-------+---------+--------------------+------+----------+---------------------------------------------+
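
For engines that do support set difference, the EXCEPT form mentioned above can be sketched as follows. This is an illustration run through Python's sqlite3 module, not the answer's own code. SQLite and the double-quoted "group" identifier are assumptions (MySQL would use backticks unless ANSI_QUOTES is enabled).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group" is a reserved word, so it is quoted as an identifier.
conn.execute('CREATE TABLE survey ("group" INT NOT NULL, subject VARCHAR(16) NOT NULL)')
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [(1, "sports"), (1, "history"), (2, "art"), (2, "music"),
     (3, "math"), (3, "sports"), (3, "science")],
)

# All ordered pairs, minus the pairs that co-occur in some group.
# EXCEPT has set semantics, so duplicates are removed as a side effect.
query = """
SELECT s1.subject, s2.subject
  FROM survey AS s1
    JOIN survey AS s2
      ON s1.subject < s2.subject
EXCEPT
SELECT s1.subject, s2.subject
  FROM survey AS s1
    JOIN survey AS s2
      ON s1."group" = s2."group"
     AND s1.subject < s2.subject
ORDER BY 1, 2
"""

pairs = conn.execute(query).fetchall()
for left, right in pairs:
    print(left, "-", right)
```

The s1.subject < s2.subject condition keeps one order per pair, so neither DISTINCT nor a separate de-duplication step is needed here.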

A left-join variant: Z lists every pair that shares a group, and the outer query keeps the pairs that have no match in Z. The join to Z compares subjects only, so a shared group is found no matter which of S1's rows is being examined:

Select Distinct S1.Subject As LeftSubject
    , S2.Subject As RightSubject
From SourceData As S1
    Join SourceData As S2
        On S2.subject > S1.subject
    Left Join   (
                Select S1.groupid
                    , S1.subject As LeftSubject
                    , S2.subject As RightSubject
                From SourceData As S1
                    Join SourceData As S2
                        On S2.groupid = S1.groupid
                            And S2.subject > S1.subject
                ) As Z
        On Z.LeftSubject = S1.subject
            And Z.RightSubject = S2.subject
Where Z.groupid is null

Another variant using outis' tuple format:

Select Distinct S1.Subject As LeftSubject
    , S2.Subject As RightSubject
From SourceData As S1
    Join SourceData As S2
        On S2.subject > S1.subject
Where (S1.subject, S2.subject) Not In (
                                                Select S1.subject
                                                    , S2.subject
                                                From SourceData As S1
                                                    Join SourceData As S2
                                                        On S2.groupid = S1.groupid
                                                Where S2.subject > S1.subject
                                                )
