sql selection from one-to-many table
I have got 3 tables with those columns below:
Top开发者_C百科ics: [TopicID] [TopicName]
Messages: [MessageID] [MessageText]
MessageTopicRelations [EntryID] [MessageID] [TopicID]
messages can be about more than one topic. question is: given couple of topics, I need to get messages which are about ALL these topics and not the less, but they can be about some other topic too. a message which is about SOME of these given topics won't be included. I hope I explained my request well. otherwise I can provide sample data. thanks
The following use x
, y
, and z
to stand in for topic ids, being that none were provided for examples.
Using JOINs:
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
JOIN TOPICS tx ON tx.topicid = mtr.topicid
AND tx.topicid = x
JOIN TOPICS ty ON ty.topicid = mtr.topicid
AND ty.topicid = y
JOIN TOPICS tz ON tz.topicid = mtr.topicid
AND tz.topicid = z
Using GROUP BY/HAVING COUNT(*):
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
JOIN TOPICS t ON t.topicid = mtr.topicid
WHERE t.topicid IN (x, y, z)
GROUP BY m.messageid, m.messagetext
HAVING COUNT(*) = 3
Of the two, the JOIN approach is safer.
The GROUP BY/HAVING relies on the MESSAGETOPICRELATIONS.TOPICID
being either part of the primary key, or having a unique key constraint to ensure there aren't duplicates. Otherwise, you could have 2+ instances of the same topic associated to a message - which would be a false positive. Using HAVING COUNT(DISTINCT ...
would clear up any false positives, but support depends on the database - MySQL supports it at 5.1+, but not on 4.1. Oracle might, have to wait till Monday to test on SQL Server...
I looked into Bill's comment about not needing the join to the TOPICS
table:
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
AND mtr.topicid IN (x, y, z)
...will return false positives - rows that match at least one of the values defined in the IN
clause. And:
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
AND mtr.topicid = x
AND mtr.topicid = y
AND mtr.topicid = z
...won't return anything at all, because the topicid
can never be all of the values at once.
Here's a profoundly inelegant solution
SELECT
m.MessageID
,m.MessageText
FROM
Messages m
WHERE
m.MessageID IN (
SELECT
mt.MessageID
FROM
MessageTopicRelations mt
WHERE
TopicID IN (1,4,5)// List of topic IDS
GROUP BY
mt.MessageID
HAVING
count(*) = 3 //Number of topics
)
Edit: thanks to @Paul Creasey and @OMG Ponies for finding the flaws in my approach.
The correct way to do this is with a self-join for each topic; as shown in the leading answer.
Another profoundly inelegant entry:
select m.MessageText
, t.TopicName
from Messages m
inner join MessageTopicRelations mtr
on mtr.MessageID = m.MessageID
inner join Topics t
on t.TopicID = mtr.TopicID
and
t.TopicName = 'topic1'
UNION
select m.MessageText
, t.TopicName
from Messages m
inner join MessageTopicRelations mtr
on mtr.MessageID = m.MessageID
inner join Topics t
on t.TopicID = mtr.TopicID
and
t.TopicName = 'topic2'
...
Re: the answer by OMG Ponies, you don't need to join to the TOPICS
table. And the HAVING COUNT(DISTINCT)
clause works fine in MySQL 5.1. I just tested it.
This is what I mean:
Using GROUP BY/HAVING COUNT(*):
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
WHERE mtr.topicid IN (x, y, z)
GROUP BY m.messageid
HAVING COUNT(DISTINCT mtr.topicid) = 3
The reason that I suggest COUNT(DISTINCT)
is that if the columns (messageid,topicid)
don't have a unique constraint, you could get duplicates, which would result in a count of 3 in the group, even with fewer than three distinct values.
精彩评论