Is there a way to LIMIT results per group of result rows in MySQL?
I have the following query:
SELECT title, karma, DATE(date_uploaded) as d
FROM image
ORDER BY d DESC, karma DESC
This will give me a list of image records, first sorted by newest day, and then by most karma.
The开发者_如何学运维re is just one thing missing: I want to only get the x images with the highest karma per day. So for example, per day I only want the 10 most karma images. I could of course run multiple queries, one per day, and then combine the results.
I was wondering if there is a smarter way that still performs well. I guess what I am looking for is a way to use LIMIT x,y per group of results?
You can do it by emulating ROW_NUMBER using variables.
SELECT d, title, karma
FROM (
SELECT
title,
karma,
DATE(date_uploaded) AS d,
@rn := CASE WHEN @prev = UNIX_TIMESTAMP(DATE(date_uploaded))
THEN @rn + 1
ELSE 1
END AS rn,
@prev := UNIX_TIMESTAMP(DATE(date_uploaded))
FROM image, (SELECT @prev := 0, @rn := 0) AS vars
ORDER BY date_uploaded, karma DESC
) T1
WHERE rn <= 3
ORDER BY d, karma DESC
Result:
'2010-04-26', 'Title9', 9
'2010-04-27', 'Title5', 8
'2010-04-27', 'Title6', 7
'2010-04-27', 'Title7', 6
'2010-04-28', 'Title4', 4
'2010-04-28', 'Title3', 3
'2010-04-28', 'Title2', 2
Quassnoi has a good article about this which explains the technique in more details: Emulating ROW_NUMBER() in MySQL - Row sampling.
Test data:
CREATE TABLE image (title NVARCHAR(100) NOT NULL, karma INT NOT NULL, date_uploaded DATE NOT NULL);
INSERT INTO image (title, karma, date_uploaded) VALUES
('Title1', 1, '2010-04-28'),
('Title2', 2, '2010-04-28'),
('Title3', 3, '2010-04-28'),
('Title4', 4, '2010-04-28'),
('Title5', 8, '2010-04-27'),
('Title6', 7, '2010-04-27'),
('Title7', 6, '2010-04-27'),
('Title8', 5, '2010-04-27'),
('Title9', 9, '2010-04-26');
Maybe this will work:
SELECT title, karma, DATE(date_uploaded) as d FROM image img WHERE id IN ( SELECT id FROM image WHERE DATE(date_uploaded)=DATE(img.date_uploaded) ORDER BY karma DESC LIMIT 10 ) ORDER BY d DESC, karma DESC
But this is not very efficient, as you don't have an index on DATE(date_uploaded) (I don't know if that would be possible, but I guess it isn't). As the table grows this can get very CPU expensive. It might be simpler to just have a loop in your code :-).
精彩评论