SQL query not returning unique results. Which type of join do I need to use?

I'm trying to run the following MySQL query:

SELECT * 
FROM user u 
JOIN user_categories uc ON u.user_id = uc.user_id 
WHERE (uc.category_id = 3 OR uc.category_id = 1)

It currently returns:

Joe,Smith,60657,male
Joe,Smith,60657,male
Mickey,Mouse,60613,female
Petter,Pan,60625,male
Petter,Pan,60625,male
Donald,Duck,60615,male

If the user belongs to both categories it currently returns them twice. How can I return the user only once without using SELECT DISTINCT, regardless of how many categories they belong to?


You need a semi join, which can be written with a subquery:

SELECT * 
FROM user u 
WHERE EXISTS(SELECT * 
       FROM user_categories uc 
       WHERE u.user_id = uc.user_id AND  
       uc.category_id IN(1,3))

However, subquery performance in MySQL can be quite problematic, so a JOIN with duplicate elimination via DISTINCT or GROUP BY may perform better.
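To see the difference between the plain join and the semi join, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The column names first_name and last_name and the sample rows are assumptions made up for illustration; only the table names and user_id/category_id come from the question.

```python
import sqlite3

# In-memory SQLite database; schema and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE user_categories (user_id INTEGER, category_id INTEGER);
    INSERT INTO user VALUES (1, 'Joe', 'Smith'), (2, 'Mickey', 'Mouse'), (3, 'Donald', 'Duck');
    -- Joe is in categories 1 and 3, Mickey only in 3, Donald only in 2.
    INSERT INTO user_categories VALUES (1, 1), (1, 3), (2, 3), (3, 2);
""")

# Plain join: one output row per matching category row, so Joe is repeated.
join_rows = conn.execute("""
    SELECT u.first_name
    FROM user u
    JOIN user_categories uc ON u.user_id = uc.user_id
    WHERE uc.category_id IN (1, 3)
    ORDER BY u.user_id
""").fetchall()

# Semi join via EXISTS: each user is tested once, so each appears at most once.
semi_rows = conn.execute("""
    SELECT u.first_name
    FROM user u
    WHERE EXISTS (SELECT 1
                  FROM user_categories uc
                  WHERE uc.user_id = u.user_id
                    AND uc.category_id IN (1, 3))
    ORDER BY u.user_id
""").fetchall()

join_names = [row[0] for row in join_rows]
semi_names = [row[0] for row in semi_rows]
print(join_names)
print(semi_names)
```

The EXISTS form only asks whether at least one matching category row exists, which is why it cannot multiply the user rows. How the two forms compare in speed depends on the database and version, so this sketch says nothing about performance.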


I don't know about MySQL, but in Postgres you may get better performance from the semi-join version written with IN:

SELECT * FROM user u 
WHERE u.user_id 
IN (SELECT user_id FROM user_categories uc WHERE uc.category_id IN (1,3));

I would expect SELECT DISTINCT to run fastest, but I have learned that my expectations and actual DB performance often differ!


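Try using a GROUP BY. Select only the user columns: SELECT * would also pull in the non-grouped user_categories columns, which MySQL rejects when ONLY_FULL_GROUP_BY is enabled.

SELECT u.* FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE uc.category_id = 3 OR uc.category_id = 1
GROUP BY u.user_id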
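As a rough check of the GROUP BY approach, the sketch below runs a grouped query against an in-memory SQLite database through Python's sqlite3 module (not MySQL). The first_name column and the sample rows are hypothetical. The query lists the grouped columns explicitly and adds a COUNT(*) to show that the duplicate rows collapse into one row per user.

```python
import sqlite3

# In-memory SQLite database; schema and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE user_categories (user_id INTEGER, category_id INTEGER);
    INSERT INTO user VALUES (1, 'Joe'), (2, 'Mickey'), (3, 'Donald');
    INSERT INTO user_categories VALUES (1, 1), (1, 3), (2, 3), (3, 2);
""")

# Group the joined rows by user so each user yields exactly one output row.
# matched_categories counts how many of the requested categories the user has.
grouped = conn.execute("""
    SELECT u.user_id, u.first_name, COUNT(*) AS matched_categories
    FROM user u
    JOIN user_categories uc ON u.user_id = uc.user_id
    WHERE uc.category_id IN (1, 3)
    GROUP BY u.user_id, u.first_name
    ORDER BY u.user_id
""").fetchall()

print(grouped)
```

Every selected column here is either grouped or aggregated, so the query does not depend on any database-specific leniency about non-grouped columns.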