SQL: many-to-many relationship, IN condition
I have a table called transactions with a many-to-many relationship to items through the items_transactions table.
I want to do something like this:
SELECT "transactions".*
FROM "transactions"
INNER JOIN "items_transactions"
ON "items_transactions".transaction_id = "transactions".id
开发者_如何转开发INNER JOIN "items"
ON "items".id = "items_transactions".item_id
WHERE (items.id IN (<list of items>))
But this gives me all transactions that have one or more of the items in the list associated with it and I only want it to give me the transactions that are associated with all of those items.
Any help would be appreciated.
You have to expand out your query for all of the items in the list:
SELECT "transactions".*
FROM "transactions"
WHERE EXISTS (SELECT 1 FROM "items_transactions"
INNER JOIN "items" ON "items".id = "items_transactions".item_id
WHERE "items_transactions".transaction_id = "transactions".id
AND "items".id = <first item in list>)
AND EXISTS (SELECT 1 FROM "items_transactions"
INNER JOIN "items" ON "items".id = "items_transactions".item_id
WHERE "items_transactions".transaction_id = "transactions".id
AND "items".id = <second item in list>)
...
You might also be able to massage it out using IN
and COUNT DISTINCT
, I'm not sure which would be faster. Something like (completely untested):
SELECT "transactions".*
FROM "transactions"
INNER JOIN (SELECT "items_transactions".transaction_id
FROM "items_transactions"
INNER JOIN "items" ON "items".id = "items_transactions".item_id
WHERE "items".id IN (<list of items>)
GROUP BY "items_transactions".transaction_id
HAVING COUNT(DISTINCT "items".id) = <count of items in list>) matches ON transactions.transaction_id = matches.transaction_id
I think this does what you want.
I would put the list of items you need in to a table (temp one will be fine) and join on to that. Then count the number of distinct items and match the count to the item transactions count.
I've provided the sample DDL & Data that I used.
Create table #trans
(
transId int identity(1,1),
trans varchar(10)
)
Create Table #itemTrans
(
transId int,
itemId int
)
Create table #items
(
itemId int identity(1,1),
item varchar(10)
)
Create table #itemsToSelect
(
itemId int
)
Insert Into #trans
Values ('Trans 1')
Insert Into #trans
Values ('Trans 2')
Insert Into #trans
Values ('Trans 3')
Insert Into #Items
Values ('Item 1')
Insert Into #Items
Values ('Item 2')
Insert Into #Items
Values ('Item 3')
Insert Into #Items
Values ('Item 4')
Insert Into #itemTrans
Values (1, 1)
Insert Into #itemTrans
Values (1, 2)
Insert Into #itemTrans
Values (1, 3)
Insert Into #itemTrans
Values (2, 1)
Insert Into #itemTrans
Values (2, 3)
Insert Into #itemTrans
Values (3, 4)
Insert Into #itemsToSelect
Values (1)
Insert Into #itemsToSelect
Values (2)
Insert Into #itemsToSelect
Values (3)
Select t.transId
From #items i
Join #itemTrans it on i.itemId = it.itemId
Join #trans t on it.transId = t.transId
Join #itemsToSelect its on it.ItemId = its.ItemId
Where it.TransId is not null
Group by t.transId
Having count(distinct(it.itemId)) = (Select count(distinct(itemId)) from #itemsToSelect)
SELECT transactions.*
WHERE (SELECT count(*)
FROM items_transactions
WHERE items_transactions.transaction_id = transactions.transaction_id
AND items_transactions.item_id IN (<list of items>)
) = <number of items>
Although this will probably do a scan of transactions, nesting the correlated subquery for each one... not particularly efficient, so maybe:
SELECT transactions.*
WHERE EXISTS (SELECT 1 FROM items_transactions
WHERE items_transactions.transaction_id = transactions.transaction_id
AND items_transactions.item_id IN (<list of items>)
)
AND
(SELECT count(*)
FROM items_transactions
WHERE items_transactions.transaction_id = transactions.transaction_id
AND items_transactions.item_id IN (<list of items>)
) = <number of items>
or something similar to persuade the DB to find transactions related to at least one of the items first, and then check each transaction is linked against all the items later.
As someone's noted, you can also simply generate join clauses for each item instead, which might well be better if the number of items isn't large.
I haven't executed this, but that should get you the result you want:
SELECT t.* FROM items i
INNER JOIN items_transactions it ON i.id = it.item_id
INNER JOIN transactions t ON it.transaction_id = t.id
WHERE i.id IN (1,2,3)
The final bit of the query looks wrong:
WHERE (items.id IN (<list of items>))
the 'in' statement is like a big OR statement rather than an AND statement, so it is expanded by the optimizer as:
WHERE (items.id = 123 OR items.id = 456 OR items.id = 789)
EDIT
I reckon you need to perform a correlated subquery on the items table.
精彩评论