How to 'add' a column to a query result while the query contains aggregate function?
I have a table named 'Attendance' which is used to record student attendance time in courses. This table has 4 columns, say 'id', 'course_id', 'attendance_time', and 'student_name'. An example of few records in this table is:
23 100 1/1/2010 10:00:00 Tom
24 100 1/1/2010 10:20:00 Bob
25 187 1/2/2010 08:01:01 Lisa
.....
I want to create a summary of the latest attendance time for each course. I created a query below:
SELECT course_id, max(attendance_time) FROM attendance GROUP BY course_id
The result would be something like this
100 &开发者_高级运维nbsp;1/1/2010 10:20:00
187 1/2/2010 08:01:01
Now, all I want to do is add the 'id' column to the result above. How to do it?
I can't just change the command to something like this
SELECT id, course_id, max(attendance_time) FROM attendance GROUP BY id, course_id
because it would return all the records as if the aggregate function is not used. Please help me.
This is a typical 'greatest per group', 'greatest-n-per-group' or 'groupwise maximum' query that comes up on Stack Overflow almost every day. You can search Stack Overflow for these terms to find many different examples of how to solve this with different databases. One way to solve it is as follows:
SELECT
T2.course_id,
T2.attendance_time
T2.id
FROM (
SELECT
course_id,
MAX(attendance_time) AS attendance_time
FROM attendance
GROUP BY course_id
) T1
JOIN attendance T2
ON T1.course_id = T2.course_id
AND T1.attendance_time = T2.attendance_time
Note that this query can in theory return multiple rows per course_id if there are multiple rows with the same attendance_time. If that cannot happen then you don't need to worry about this issue. If this is a potential problem then you can solve this by adding an extra grouping on course_id, attendance_time and selecting the minimum or maximum id.
What do you need the additional column for? It already has a course ID, which identifies the data. A synthetic ID to the query would be useless because it does not refer to anything. If you want to get the max from the query results for a single course, then you can add a where condition like this:
SELECT course_id, max(attendance_time) FROM attendance GROUP BY course_id **WHERE course_id = your_id_here**;
If you mean that the column should be named 'id', you can alias it in the query:
SELECT course_id **AS id**, max(attendance_time) FROM attendance GROUP BY course_id;
You could make a view out of your query to easily access the aggregate data:
CREATE VIEW max_course_times AS SELECT course_id AS id, max(attendance_time) FROM attendance GROUP BY course_id;
SELECT * FROM max_course_times;
For SQL Server 2008 onwards, I like to use a Common Table Expression to add aggregated columns to queries:
WITH AttendanceTimes (course_id, maxTime)
AS
(
SELECT
course_id,
MAX(attendance_time)
FROM attendance
GROUP BY course_id
)
SELECT
a.course_id,
t.maxTime,
a.id
FROM attendance a
INNER JOIN AttendanceTimes t
ON a.course_id = t.course_id
精彩评论