Exclude records from one table based on count in joined table
I have two tables: Coupons and Respon开发者_如何学Pythonses.
The Coupons table includes fields such as:
- VendorID [key]
- CouponID [key]
- CouponDescription
The Responses table includes fields such as:
- RespondentID [key]
- CouponID [key]
- (VendorID not repeated in this table.)
- Accepted [true/false]
- Rejected [true/false]
- Rating [1 to 5]
When someone accepts a coupon, they eventually rate it in the Responses table. Likewise, when someone rejects a coupon, the rejection appears in the Responses table.
When fetching a list of available coupons from the Coupons table, I'd like to exclude vendors who previously received two or more rejections from a respondent. For instance if RespondentID 3 previously rejected two or more coupons from Vendor 47, all coupons from Vendor 47 would no longer be shown to RespondentID 3.
Two things make it difficult for a SQL novice like myself.
The first is how to write the subquery that counts rejections from a particular vendor and by a particular respondent.
The second is how to join-back (so to speak) the Responses table to the Coupons table so that the sub-query of the Responses table bases its result on the RespondentID (which is in the Responses table) and the VendorID (which is not in the Responses table and must be determined based on the CouponID (which is in both tables).
Thanks for your assistance. It's always appreciated.
Could be something like this:
SELECT * FROM Coupons
WHERE VendorId NOT IN (
SELECT C.VendorId
FROM Responses R JOIN Coupons C ON R.CouponId = C.CouponId
WHERE R.RespondentID = 3 AND R.Rejected = True
GROUP BY C.VendorId
HAVING SUM(R.Rejected) > 2
)
I've never used HAVING
without including the value it in the SELECT
but I think it could work. Also, not sure about the data type of Rejected, probably SUM() won't work.
OK, this is a long shot and I'm not sure this query will actually run, but I think the key is that you have to include the Coupon table twice. Try this (I think, however, that where
is not allowed like this):
select c.CouponID, c.CouponDescription,
(select count(r.CouponID)
from Responses r
inner join Coupons c2 on c2.CouponID = r.CouponID
where r.RespondentID = 3
and r.Rejected = true
and c2.VendorID = c.VendorID) as countRejections
from Coupons c
where countRejections < 2
Here's what I came up with...
SELECT
*
FROM Coupons C
WHERE C.OriginatorID NOT IN (
SELECT
DISTINCT C.OriginatorID
FROM Responses R
INNER JOIN Coupons C ON C.CouponID = R.CouponID
WHERE C.OriginatorID IN (
SELECT Originators FROM (
SELECT C.OriginatorID As Originators, Sum(R.Rejected) AS Rejections
FROM Responses R INNER JOIN Coupons C ON C.CouponID = R.CouponID
WHERE R.RespondentID = 1
GROUP BY C.OriginatorID
) AS RejectionTally
WHERE Rejections > 1
)
)
It has the disadvantage of an additional embedded SELECT, but it appears that most of the performance hit occurs the moment I attempt to exclude particular OriginatorIDs from the Coupons table anyhow. For example, manually entering an exclusion, such as...
SELECT * FROM Coupons WHERE OriginatorID <> 10
has the same effect. On a test table of 50 rows, the processing time is .27s as opposed to an unconstrained query that has a processing time of .08s.
The extra embedding is due to my needing a single column for the outer NOT IN (i.e., the OriginatorID's that I'm seeking to exclude). That meant I had to first create the SUMs, then isolate the records whose SUM was less than the criterion (1), and then exclude those records.
精彩评论