SQL Server sample of each type of record
I have a table full of records for various semesters, and I want a query to select the top 5 rows from each semester.
I'd like one resultset sorted by a key field (serial) and one resultset with random sorts within each semester.
The开发者_如何转开发 table looks like this:
- serial int
- semester char(4)
- grade int
Table data:
Semester|Serial|NewSerial
SP10 |1 |1001
SP10 |2 |1002
SP10 |3 |1003
SP10 |4 |1004
SP10 |5 |1005
SP10 |6 |1006
SP10 |7 |1007
SP10 |8 |1008
SP10 |9 |1009
SP10 |10 |1010
FA10 |1 |2001
FA10 |2 |2002
FA10 |3 |2003
FA10 |4 |2004
FA10 |5 |2005
FA10 |6 |2006
FA10 |7 |2007
FA10 |8 |2008
FA10 |9 |2009
FA10 |10 |2010
FA09 |1 |3001
FA09 |2 |3002
FA09 |3 |3003
FA09 |4 |3004
FA09 |5 |3005
FA09 |6 |3006
FA09 |7 |3007
FA09 |8 |3008
FA09 |9 |3009
FA09 |10 |3010
Result 1: ordered by serial, top 5
Semester|Serial|NewSerial
SP10 |1 |1001
SP10 |2 |1002
SP10 |3 |1003
SP10 |4 |1004
SP10 |5 |1005
FA10 |1 |2001
FA10 |2 |2002
FA10 |3 |2003
FA10 |4 |2004
FA10 |5 |2005
FA09 |1 |3001
FA09 |2 |3002
FA09 |3 |3003
FA09 |4 |3004
FA09 |5 |3005
Result 2: order by random within semester (sampling)
Semester|Serial|NewSerial
SP10 |3 |1003
SP10 |1 |1001
SP10 |5 |1005
SP10 |2 |1002
SP10 |4 |1004
FA10 |2 |2002
FA10 |1 |2001
FA10 |4 |2004
FA10 |3 |2003
FA10 |5 |2005
FA09 |3 |3003
FA09 |1 |3001
FA09 |2 |3002
FA09 |5 |3005
FA09 |4 |3004
these should do it:
WITH [query_semester] AS (
SELECT RN = ROW_NUMBER() OVER (PARTITION BY semester
ORDER BY serial),
serial, semester, grade
FROM SemesterTable)
SELECT serial, semester, grade
FROM [query_semester] WHERE [RN] <= 5;
and the random one:
WITH [query_semester] AS (
SELECT RN = ROW_NUMBER() OVER (PARTITION BY semester
ORDER BY NEWID()),
serial, semester, grade
FROM SemesterTable)
SELECT serial, semester, grade
FROM [query_semester] WHERE [RN] <= 5;
Show us some sample data, and expected results.
You will need to SELECT TOP 5 You will also need to GROUP BY the semester type (if that is a date so be it).
Instead of TOP 5 use HAVING count...
No idea what criteria you are using to make up the new serial, but if it just needs to be serially incrementing by semester (in the thousand position), then
select semester, serial,
newserial = (DENSE_RANK() over (order by semester) * 1000) + rn
from
( select *, rn=ROW_NUMBER() over (partition by semester order by serial)
from tbl
) x
where rn <= 5
order by semester
For the order by serial. Swap order by serial
for order by newid()
to get the random sampling.
Notes:
- SQL Server 2005+ at compatibility level 90+
- The formula for newserial works fine because each semester can only have 5 items so there won't be clashing issues
EDIT
It seems counter-intuitive in the question, but if newserial is just a regular field (it wasn't listed), then the select part becomes simply
select semester, serial, newserial
精彩评论