开发者

SQL: many-to-many relationship, IN condition

I have a table called transactions with a many-to-many relationship to items through the items_transactions table.

I want to do something like this:

SELECT "transactions".* 
  FROM "transactions" 
INNER JOIN "items_transactions" 
        ON "items_transactions".transaction_id = "transactions".id 
开发者_如何转开发INNER JOIN "items" 
        ON "items".id = "items_transactions".item_id 
WHERE (items.id IN (<list of items>))

But this gives me all transactions that have one or more of the items in the list associated with it and I only want it to give me the transactions that are associated with all of those items.

Any help would be appreciated.


You have to expand out your query for all of the items in the list:

SELECT "transactions".* 
FROM "transactions" 
WHERE EXISTS (SELECT 1 FROM "items_transactions"
              INNER JOIN "items" ON "items".id = "items_transactions".item_id 
              WHERE "items_transactions".transaction_id = "transactions".id
              AND "items".id = <first item in list>)
AND   EXISTS (SELECT 1 FROM "items_transactions"
              INNER JOIN "items" ON "items".id = "items_transactions".item_id 
              WHERE "items_transactions".transaction_id = "transactions".id
              AND "items".id = <second item in list>)
...

You might also be able to massage it out using IN and COUNT DISTINCT, I'm not sure which would be faster. Something like (completely untested):

SELECT "transactions".* 
FROM "transactions" 
INNER JOIN (SELECT "items_transactions".transaction_id 
            FROM "items_transactions"
            INNER JOIN "items" ON "items".id = "items_transactions".item_id 
            WHERE "items".id IN (<list of items>)
            GROUP BY "items_transactions".transaction_id
            HAVING COUNT(DISTINCT "items".id) = <count of items in list>) matches ON transactions.transaction_id = matches.transaction_id


I think this does what you want.

I would put the list of items you need in to a table (temp one will be fine) and join on to that. Then count the number of distinct items and match the count to the item transactions count.

I've provided the sample DDL & Data that I used.

Create table #trans
(
transId int identity(1,1),
trans varchar(10)
)

Create Table #itemTrans
(
transId int,
itemId int
)

Create table #items
(
itemId int identity(1,1),
item varchar(10)
)

Create table #itemsToSelect
(
itemId int
)


Insert Into #trans
Values ('Trans 1')

Insert Into #trans
Values ('Trans 2')

Insert Into #trans
Values ('Trans 3')


Insert Into #Items
Values ('Item 1')

Insert Into #Items
Values ('Item 2')

Insert Into #Items
Values ('Item 3')

Insert Into #Items
Values ('Item 4')

Insert Into #itemTrans
Values (1, 1)

Insert Into #itemTrans
Values (1, 2)

Insert Into #itemTrans
Values (1, 3)

Insert Into #itemTrans
Values (2, 1)

Insert Into #itemTrans
Values (2, 3)

Insert Into #itemTrans
Values (3, 4)



Insert Into #itemsToSelect
Values (1)
Insert Into #itemsToSelect
Values (2)
Insert Into #itemsToSelect
Values (3)


Select t.transId

From #items i 
Join #itemTrans it on i.itemId = it.itemId
Join #trans t on it.transId = t.transId

Join #itemsToSelect its on it.ItemId = its.ItemId

Where it.TransId is not null
Group by t.transId
Having count(distinct(it.itemId)) = (Select count(distinct(itemId)) from #itemsToSelect)


SELECT transactions.*
WHERE (SELECT count(*)
       FROM items_transactions
       WHERE items_transactions.transaction_id = transactions.transaction_id
             AND items_transactions.item_id IN (<list of items>)
      ) = <number of items>

Although this will probably do a scan of transactions, nesting the correlated subquery for each one... not particularly efficient, so maybe:

SELECT transactions.*
WHERE EXISTS (SELECT 1 FROM items_transactions
              WHERE items_transactions.transaction_id = transactions.transaction_id
              AND items_transactions.item_id IN (<list of items>)
      )
      AND
      (SELECT count(*)
       FROM items_transactions
       WHERE items_transactions.transaction_id = transactions.transaction_id
             AND items_transactions.item_id IN (<list of items>)
      ) = <number of items>

or something similar to persuade the DB to find transactions related to at least one of the items first, and then check each transaction is linked against all the items later.

As someone's noted, you can also simply generate join clauses for each item instead, which might well be better if the number of items isn't large.


I haven't executed this, but that should get you the result you want:

SELECT t.* FROM items i
    INNER JOIN items_transactions it ON i.id = it.item_id
        INNER JOIN transactions t ON it.transaction_id = t.id
WHERE i.id IN (1,2,3)


The final bit of the query looks wrong:

WHERE (items.id IN (<list of items>))

the 'in' statement is like a big OR statement rather than an AND statement, so it is expanded by the optimizer as:

WHERE (items.id = 123 OR items.id = 456 OR items.id = 789)

EDIT

I reckon you need to perform a correlated subquery on the items table.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜