Does the following query correct the problem?
1) The following query obtains from each film category the cheapest possible DVD with the highest rating:
SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
(SELECT MIN(DVDPrice)
FROM Films AS FM2
WHERE FM2.DVDPrice IS NOT NULL
AND FM1.CategoryId = FM2.CategoryId
AND FM2.Rating =
(SELECT MAX(FM3.Rating)
FROM Films AS FM3
WHERE FM3.DVDPrice IS NOT NULL
AND FM2.CategoryId = FM3.CategoryId
)
)
ORDER BY FM1.CategoryId;
Query is not perfect, since film “Alien3” in category A could have a very lo开发者_如何转开发w rating, but if its DVDPrice happens to be the same as the price of a cheapest film with highest rating ( in the same category ), then “Alien3” will also be returned in a query. Does the following query correct this problem:
SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
(SELECT MIN(DVDPrice)
FROM Films AS FM2
WHERE FM2.DVDPrice IS NOT NULL
AND FM1.CategoryId = FM2.CategoryId
AND FM1.Rating =
(SELECT MAX(FM3.Rating)
FROM Films AS FM3
WHERE FM3.DVDPrice IS NOT NULL
AND FM2.CategoryId = FM3.CategoryId
)
)
AND FM1.Rating=(SELECT MAX(FM2.Rating)
FROM Films AS FM2
WHERE FM2.DVDPrice IS NOT NULL
AND FM2.CategoryId = FM1.CategoryId
)
ORDER BY FM1.CategoryId;
2) I was sure that by changing "FM2.Rating=" to "FM1.Rating=" that the query would also produce the correct results, but it doesn’t. Any idea why it doesn’t work?
SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
(SELECT MIN(DVDPrice)
FROM Films AS FM2
WHERE FM2.DVDPrice IS NOT NULL
AND FM1.CategoryId = FM2.CategoryId
AND FM1.Rating =
(SELECT MAX(FM3.Rating)
FROM Films AS FM3
WHERE FM3.DVDPrice IS NOT NULL
AND FM2.CategoryId = FM3.CategoryId
)
)
ORDER BY FM1.CategoryId;
thanx
EDIT - REPLYING TO MR. Bill Karwin
If we insert into Films table the following rows:
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 1, 'The Dirty Half Dozen', 1987, 'Six men go to war wearing unwashed uniforms. The horror!', 'N', 2, 4, NULL );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 2, 'On Golden Puddle', 1967, 'A couple find love while wading through a puddle', 'Y', 4, 2, 12.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 3, 'The Lion, the Witch, and the Chest of Drawers', 1977, 'A fun film for all those interested in zoo/magic/furniture drama', 'N', 1, 3, NULL );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 4, 'Nightmare on Oak Street, Part 23', 1997, 'The murderous Terry stalks Oak Street', 'Y', 2, 3, 9.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 5, 'The Wide Brimmed Hat', 2005, 'Fascinating life story of a wide brimmed hat', 'N', 1, 5, NULL );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 6, 'Sense and Insensitivity', 2001, 'She longs for a new life with Mr Arcy, he longs for a small cottage in the Hamptons', 'Y', 3, 6, 15.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 7, 'Planet of the Japes', 1967, 'Earth has been destroyed, to be taken over by a species of comedians', 'Y', 5, 4, 12.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 8, 'The Maltese Poodle', 1947, 'A mysterious bite mark, a guilty looking poodle. 1st class thriller', 'Y', 1, 1, 2.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 2, '15th Late Afternoon', 1989, 'One of Shakespeare''s lesser known plays', 'N', 5, 6, 12.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 2, 'Soylent Yellow', 1967, 'Detective Billy Brambles discovers Soylent Yellow is made of Soya Bean. Ewwww!', 'Y', 5, 5, 12.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 16, 'First', 1967, '', 'Y', 1, 2, 12.99 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 17, 'Second', 1967, '', 'Y', 7, 2, 100 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 19, 'Third', 1967, '', 'Y', 7, 2, 10 );
INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 20, 'Fourth', 1967, '', 'Y', 1, 2, 10 );
Then my second query ( the one you've said that it works ) returns the following films:
• The Maltese Poodle • Third Nightmare on Oak Street • Planet of the Japes • Soylent Yellow • 15th Late Afternoonwhile yours returns
• The Dirty Half Dozen • The Lion • the Witch and the Chest of Drawers • Nightmare on Oak Street • The Wide Brimmed Hat • Planet of the Japes • The Maltese Poodle • 15th Late Afternoon • Soylent Yellow and ThirdEDIT - REPLYING TO MR. Russell Steen
SELECT * FROM Films INNER JOIN
(SELECT Min(DVDPrice) as DVDPrice, MaxRating, x1.CategoryId, x1.Category FROM
(SELECT FilmName, DVDPrice, Rating, MaxRating, Category, Films.CategoryId FROM Films INNER JOIN
(SELECT MAX(Rating) as MaxRating, FM1.CategoryId, C1.Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
GROUP BY Category, FM1.CategoryId
) x on Films.Rating = x.MaxRating and Films.CategoryId = x.CategoryID
) x1
WHERE DVDPrice IS NOT NULL
GROUP BY CategoryId,Category
) y on Films.Rating = y.MaxRating and Films.CategoryId = y.CategoryId and Films.DVDPrice = y.DVDPrice
I’ve rewrote your query best I could, but I’m not so sure whether it produces the correct results and truth be told, I’ve got completely lost on whether particular inner queries should also select FilmID columns etc.
EDIT - SECOND REPLY TO MR. Bill Karwin
SELECT f.FilmName, f.Rating, f.DVDPrice, f.CategoryId
FROM Films f
LEFT OUTER JOIN Films p ON (f.CategoryId = p.CategoryId
AND p.AvailableOnDvd = 'Y' AND f.AvailableOnDvd = 'Y' AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r ON (f.CategoryId = r.CategoryId
AND r.AvailableOnDvd = 'Y' AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE p.CategoryId IS NULL AND r.CategoryId IS NULL
ORDER BY f.CategoryId;
First let's state the problem. We want all the films with the highest rating for each category. Then, of those, we want the lowest price.
First get the highest ratings
SELECT * FROM Films
INNER JOIN
(SELECT Max(Rating) as Rating, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
GROUP BY Category
) x on Films.Rating = x.Rating and Films.Category = x.Category
Now, from that, get the cheapest price
SELECT * FROM Films INNER JOIN
(SELECT Min(DVDPrice), x.Rating, Category FROM
(SELECT * FROM Films INNER JOIN
(SELECT MAX(Rating) as Rating, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
GROUP BY Category
) x on Films.Rating = x.Rating and Films.Category = x.Category
)
WHERE DVDPrice IS NOT NULL
GROUP BY Category, DVDPrice
) y on Films.Rating = y.Rating and Films.Category = y.Category and Films.DVDRating = y.DVDRating
what you want is:
-----------------
for each category, retrieve a film that meets the following 2 conditions:
_condition1:
____rating= max rating in that category
_
_condition2:
____price= min price in that category for films verifying condition 1
_
-->
in other terms it's equivalent to order films by Rating Desc then DVDPrice Asc for each category and take the first one.
1 solution is:
SELECT FilmName, Rating, DVDPrice, Category
FROM Films FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.FilmId = (SELECT TOP 1 FilmId
FROM Films AS FM2
WHERE FM2.CategoryId = FM1.CategoryId
ORDER BY Rating DESC, DVDPrice)
OR:
SELECT FM.FilmName, FM.Rating, FM.DVDPrice, C1.Category
FROM (SELECT FM0.*, ROW_NUMBER() over (ORDER BY Rating DESC, DVDPrice) rank
FROM Films FM0) FM
INNER JOIN Category AS C1 ON C1.CategoryId = FM.CategoryId
INNER JOIN (SELECT FM1.CategoryId, MIN(FM1.rank) rank
FROM (SELECT CategoryId,
ROW_NUMBER() over (ORDER BY Rating DESC,DVDPrice) rank
FROM Films) AS FM1
GROUP BY CategoryId) FM2
ON FM.CategoryId = FM2.CategoryId
AND FM.rank = FM2.rank
with your data, I've done some tests and it seems that the following query is better than the 2 above:
SELECT FM.*, C.Category
FROM (SELECT FM1.CategoryId, MAX(FM1.FilmId) FilmId
FROM Films FM1
WHERE NOT EXISTS (SELECT NULL
FROM Films AS FM2
WHERE FM2.CategoryId = FM1.CategoryId
AND (FM1.Rating < FM2.Rating
OR ( FM1.Rating = FM2.Rating
AND FM1.DVDPrice > FM2.DVDPrice)
)
)
GROUP BY FM1.CategoryId) FF
INNER JOIN Films FM on FM.FilmId = FF.FilmId
AND FM.CategoryId = FF.CategoryId
INNER JOIN Category AS C1 ON C1.CategoryId = FM.CategoryId
1) Yes, the second query you give looks better. But I give +1 to @Russell Steen's solution because it avoids the use of correlated subqueries.
This is a variation of the greatest-n-per-group problem that I see frequently on SO. Here's another possible solution:
SELECT f.*
FROM Films f
LEFT OUTER JOIN Films p
ON (f.CategoryId = p.CategoryId AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r
ON (f.CategoryId = r.CategoryId AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE p.CategoryId IS NULL AND r.CategoryId IS NULL;
The explanation is that we try to find a film "p
" in the same category, with a lower price. When we have found none, p.*
will be NULL because that's how outer joins work. When there are no dvd's with a lower price, we've found the one with lowest price.
We further try the same trick to find a film "r
" with the highest rating. This time we restrict to films in the same category and with the same price (that is, the lowest price) as the film f
. Otherwise we'd unintentionally find the film with the highest rating in the category, even if it isn't cheap.
You can also reverse the order of the joins, first finding the highest rating and then finding the lowest price among those with the higest rating. It depends on what you place at a greater priority -- low price or high rating. No matter what solution you use, you have to make a decision about this priority.
2) The other query you tried doesn't work because the condition you use in the subquery doesn't eliminate any of the wrong rows of the FT2 subquery. It's a "Green Eggs and Ham" problem: whether on a train or on a plane, on a boat or on a goat, you've still got green eggs and ham included in the meal.
update: Okay, thanks for the sample data. When you first asked the question, you didn't include the information that some films could be ineligible because they aren't available on DVD and have a NULL in the DVDPrice
column. Here's an updated query using my technique that returns the correct films, one per category, excluding films that aren't available on DVD, with the lowest price and highest rating:
SELECT f.FilmName, f.Rating, f.DVDPrice, f.CategoryId
FROM Films f
LEFT OUTER JOIN Films p ON (f.CategoryId = p.CategoryId
AND p.AvailableOnDvd = 'Y' AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r ON (f.CategoryId = r.CategoryId
AND r.AvailableOnDvd = 'Y' AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE f.AvailableOnDvd = 'Y' AND p.CategoryId IS NULL AND r.CategoryId IS NULL
ORDER BY f.CategoryId;
Output:
+-------------------------+--------+----------+------------+
| FilmName | Rating | DVDPrice | CategoryId |
+-------------------------+--------+----------+------------+
| The Maltese Poodle | 1 | 2.99 | 1 |
| Third | 7 | 10.00 | 2 |
| Nightmare on Oak Street | 2 | 9.99 | 3 |
| Planet of the Japes | 5 | 12.99 | 4 |
| Soylent Yellow | 5 | 12.99 | 5 |
| Sense and Insensitivity | 3 | 15.99 | 6 |
+-------------------------+--------+----------+------------+
This differs from your result in category 6, because Sense and Insensitivity in your sample data is the only film that is available on DVD. 15 Late Afternoon is not available, even though it has a non-null value for DVDPrice. If I change it to AvailableOnDvd='Y'
then 15 Late Afternoon is chosen instead of the other film.
Regarding your question about how did I solve this, it's a variation of a common question in SQL, which I have tagged the "greatest-n-per-group" question. You want the query to return every film f
such that no film exists with a lower DVDPrice
in the same category. I solve with an outer join to p
and if no matches are found in p
then f
must have the lowest price in that category. That's the common solution.
Your added twist in this problem is that you have another attribute to filter on. So given the film (or films in the case of ties) with the lowest price, you want the one with the highest rating. The technique is the same, to use an outer join to r
where the category and price is equal, and the rating is higher. When no such films are found with a higher rating, then f
must have the highest rating for a given category and price.
I'm going to add a tag to your question greatest-n-per-group
so you can follow it and view other SQL questions that are solved with the same technique.
精彩评论