开发者

SQL Query for all pairs of elements that are only in different groups

I have a table called Survey with a Group Column and a Subject Column

CREATE TABLE survey (
  `group` INT NOT NULL,
  `subject` VARCHAR(16) NOT NULL,
  UNIQUE INDEX (`group`, `subject`)
);

INSERT INTO survey 
  VALUES
  (1, 'sports'),
  (1, 'history'),
  (2, 'art'),
  (2, 'music'),
  (3, 'math'),
  (3, 'sports'),
  (3, 'science')
;

I am trying to figure out a query that wi开发者_如何学编程ll return all pairs of subjects that are not part of the same group. So from my above example, I would like to see these pairs returned in a table:

science - history  
science - art  
science - music  
history - math  
sports  - art  
sports  - music  
history - art  
history - music

Thus, the query shouldn't return:

sports - history  

as an example since they are both in Group 1.

Thanks so much.


SELECT s1.subject,
       s2.subject
FROM   survey s1
       JOIN survey s2
         ON s1.subject < s2.subject
GROUP  BY s1.subject,
          s2.subject
HAVING COUNT(CASE
               WHEN s1.groupid = s2.groupid THEN 1
             END) = 0   


Sample table

create table Survey(groupid int, subject varchar(100))
insert into Survey select
1, 'sports' union all select
1, 'history' union all select
2, 'art' union all select
2, 'music' union all select
3, 'math' union all select
3, 'sports' union all select
3, 'science'

The ANSI-compliant query, which works for all mainstream RDBMS

select a.subject, b.subject
from (select distinct subject from Survey) A
inner join (select distinct subject from Survey) B on B.subject > A.subject
left join Survey C on C.subject = A.subject
left join Survey D on D.subject = B.subject and D.groupid = C.groupid
where D.groupid is null
order by a.subject, b.subject


Here's a slightly different approach:

SELECT *
FROM (SELECT DISTINCT subject FROM yourtable) AS T1
JOIN (SELECT DISTINCT subject FROM yourtable) AS T2
ON T1.subject < T2.subject
WHERE NOT EXISTS
(
    SELECT *
    FROM yourtable T3
    JOIN yourtable T4
    ON T3.id = T4.id
    WHERE T1.subject = T3.subject
    AND T2.subject = T4.subject
)
ORDER BY t1.subject, t2.subject;


The standard way would be to use MINUS to get the complement of all pairs that are in the same group, but MySQL doesn't support MINUS. For MySQL, you can transform a MINUS into a statement based on the NOT IN operator and a sub-query:

SELECT s1.subject, s2.subject
  FROM survey AS s1
    JOIN survey AS s2
  WHERE (s1.subject, s2.subject) NOT IN
  (
    SELECT s1.subject, s2.subject
      FROM survey AS s1
        JOIN survey AS s2
          ON s1.group = s2.group
  )
;

Note that this can produce duplicates. If you don't want them, use SELECT DISTINCT.

With indices and the sample data, the extended query plan is:

+----+--------------------+-------+--------+---------------+-------+---------+--------------------+------+----------+---------------------------------------------+
| id | select_type        | table | type   | possible_keys | key   | key_len | ref                | rows | filtered | Extra                                       |
+----+--------------------+-------+--------+---------------+-------+---------+--------------------+------+----------+---------------------------------------------+
|  1 | PRIMARY            | s1    | index  | NULL          | group | 54      | NULL               |    7 |   100.00 | Using index; Using temporary                |
|  1 | PRIMARY            | s2    | index  | NULL          | group | 54      | NULL               |    7 |   100.00 | Using where; Using index; Using join buffer |
|  2 | DEPENDENT SUBQUERY | s1    | index  | group         | group | 54      | NULL               |    7 |    85.71 | Using where; Using index                    |
|  2 | DEPENDENT SUBQUERY | s2    | eq_ref | group         | group | 54      | test.s1.group,func |    1 |   100.00 | Using where; Using index                    |
+----+--------------------+-------+--------+---------------+-------+---------+--------------------+------+----------+---------------------------------------------+


Select S1.Subject As LeftSubject
    , S2.Subject As RightSubject
From SourceData As S1
    Join SourceData As S2
        On S2.subject > S1.subject
    Left Join   (
                Select S1.groupid
                    , S1.subject As LeftSubject
                    , S2.subject As RightSubject
                From SourceData As S1
                    Join SourceData As S2
                        On S2.groupid = S1.groupid
                            And S2.subject > S1.subject
                ) As Z
        On Z.groupid = S1.groupid
            And Z.LeftSubject = S1.subject
            And Z.RightSubject = S2.subject
Where Z.groupid is null

Another variant using outis' tuple format:

Select S1.Subject As LeftSubject
    , S2.Subject As RightSubject
From SourceData As S1
    Join SourceData As S2
        On S2.subject > S1.subject
Where (S1.groupid, S1.subject, S2.subject) Not In (
                                                Select S1.groupid
                                                    , S1.subject
                                                    , S2.subject
                                                From SourceData As S1
                                                    Join SourceData As S2
                                                        On S2.groupid = S1.groupid
                                                Where S2.subject > S1.subject
                                                )
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜