Finding Unique Rows that Fail to Join in MySQL?

I have two tables. One is called "keywords"; this table simply stores keywords as a unique keyword ID and the text of the keyword. The other is called "keylinks"; this table stores rows linking a media ID to a keyword ID.

If I had a media item, and I wanted to get all the keywords for that media item, I would use the following code:

   SELECT keywords.*, keylinks.*
     FROM keywords
LEFT JOIN keylinks ON (keylinks.keyword_id = keywords.keyword_id)
    WHERE keylinks.media_id = ?

What if I wanted to do the opposite?

Instead of getting the keywords that match a media ID, I would like to get the keywords that DON'T match a media ID. How would I do this? I can't simply use WHERE keylinks.media_id != ? because that would return thousands of keylink rows that belong to other media items, and the keywords on those rows may in fact also be linked to the media ID I am excluding.


There are at least three ways of doing this. ANSI SQL also provides EXCEPT, which doesn't appear to be supported by MySQL at this time.

LEFT JOIN/IS NULL

The placement of criteria in an OUTER JOIN is crucial. A criterion in the WHERE clause is applied after the JOIN; a criterion in the JOIN's ON clause is applied before the JOIN.

   SELECT k.*
     FROM KEYWORDS k
LEFT JOIN KEYLINKS kl ON kl.keyword_id = k.keyword_id 
                     AND kl.media_id = ?
    WHERE kl.media_id IS NULL
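
To see why the placement matters, here is a small sketch that runs the query both ways. It makes some assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, and the three keywords and the two media IDs are made up for illustration. Only the table and column names come from the question.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (keyword_id INTEGER PRIMARY KEY, keyword TEXT);
    CREATE TABLE keylinks (media_id INTEGER, keyword_id INTEGER);
    INSERT INTO keywords VALUES (1, 'sunset'), (2, 'beach'), (3, 'city');
    -- media 10 is tagged sunset + beach; media 20 is tagged beach + city
    INSERT INTO keylinks VALUES (10, 1), (10, 2), (20, 2), (20, 3);
""")

media_id = 10

# Media filter in the ON clause: keywords without a link to this media
# survive the join with NULLs, and IS NULL then keeps exactly those.
in_on = conn.execute("""
    SELECT k.keyword
      FROM keywords k
 LEFT JOIN keylinks kl ON kl.keyword_id = k.keyword_id
                      AND kl.media_id = ?
     WHERE kl.media_id IS NULL
""", (media_id,)).fetchall()

# Media filter in the WHERE clause: it runs after the join, so a row can
# never satisfy both "media_id = ?" and "media_id IS NULL".
in_where = conn.execute("""
    SELECT k.keyword
      FROM keywords k
 LEFT JOIN keylinks kl ON kl.keyword_id = k.keyword_id
     WHERE kl.media_id = ?
       AND kl.media_id IS NULL
""", (media_id,)).fetchall()

print(in_on)     # [('city',)]
print(in_where)  # []
```

With the filter in the ON clause, only 'city' comes back, which is the one keyword not linked to media 10. With the filter moved to WHERE, nothing comes back at all.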

NOT EXISTS

SELECT k.*
  FROM KEYWORDS k
 WHERE NOT EXISTS (SELECT NULL 
                     FROM KEYLINKS kl 
                    WHERE kl.keyword_id = k.keyword_id
                      AND kl.media_id = ?)

NOT IN

SELECT k.*
  FROM KEYWORDS k
 WHERE k.keyword_id NOT IN (SELECT kl.keyword_id
                              FROM KEYLINKS kl
                             WHERE kl.media_id = ?)

Which Performs Best?

It depends on whether the columns compared are nullable (that is, whether the value could be NULL).

  • If the columns are not nullable, LEFT JOIN / IS NULL is the fastest option. This holds for MySQL only.
  • If the columns are nullable, NOT EXISTS and NOT IN are the most efficient.
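
Nullability also affects results, not only speed. The sketch below does not measure performance; it only checks that the three forms return the same rows while keylinks.keyword_id holds no NULLs, and shows how NOT IN changes once a NULL appears. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, with made-up sample rows. The NULL handling shown is standard SQL three-valued logic, which MySQL follows as well.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (keyword_id INTEGER PRIMARY KEY, keyword TEXT);
    CREATE TABLE keylinks (media_id INTEGER, keyword_id INTEGER);  -- keyword_id is nullable
    INSERT INTO keywords VALUES (1, 'sunset'), (2, 'beach'), (3, 'city');
    INSERT INTO keylinks VALUES (10, 1), (10, 2), (20, 3);
""")

# The three forms from the answer, selecting just the keyword text.
QUERIES = {
    "left_join": """
        SELECT k.keyword
          FROM keywords k
     LEFT JOIN keylinks kl ON kl.keyword_id = k.keyword_id
                          AND kl.media_id = ?
         WHERE kl.media_id IS NULL""",
    "not_exists": """
        SELECT k.keyword
          FROM keywords k
         WHERE NOT EXISTS (SELECT NULL
                             FROM keylinks kl
                            WHERE kl.keyword_id = k.keyword_id
                              AND kl.media_id = ?)""",
    "not_in": """
        SELECT k.keyword
          FROM keywords k
         WHERE k.keyword_id NOT IN (SELECT kl.keyword_id
                                      FROM keylinks kl
                                     WHERE kl.media_id = ?)""",
}

def run_all(media_id):
    """Run every query form for one media ID and collect the rows."""
    return {name: conn.execute(sql, (media_id,)).fetchall()
            for name, sql in QUERIES.items()}

before = run_all(10)  # no NULL keyword_id yet: all three forms agree

# Add a link row whose keyword_id is NULL for the same media item.
conn.execute("INSERT INTO keylinks VALUES (10, NULL)")
after = run_all(10)   # NOT IN now compares against a NULL and matches nothing

print(before)
print(after)
```

Before the NULL row is inserted, each form returns only 'city'. Afterwards, LEFT JOIN / IS NULL and NOT EXISTS still return 'city', while NOT IN returns no rows, because `3 NOT IN (1, 2, NULL)` evaluates to unknown rather than true.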


SELECT K1.keyword_id 
  FROM keywords AS K1
 WHERE 
   NOT EXISTS (SELECT * 
                 FROM keylinks AS K2 
                WHERE K2.keyword_id = K1.keyword_id 
                  AND K2.media_id = %d);

This will give you all keywords that have no keylink for the given media ID.
