Advice on database design / SQL for retrieving data with chronological order
I am creating a database that will help keep track of which employees have been on a certain training course. I would like to get some guidance on the best way to design the database.
Specifically, each employee must attend the training course each year and my database needs to keep a history of all the dates on which they have attend the course in the past.
The end user will use the software as a planning tool to help them book future course dates for employees. When they select a given employee they will see:
- (a) Last attendance date
- (b) Projected future attendance date(i.e. last attendance date + 1 calendar year)
In terms of my database, any given employee may have multiple past course attendance dates:
EmpName AttandanceDate
Joe Bloggs 1st Jan 2007
Joe Bloggs 4th Jan 2008
Joe Bloggs 3rd Jan 2009
Joe Bloggs 8th Jan 2010
My question is what is the best way to set up the database to make it easy to retrieve the most recent course attendan开发者_如何学Cce date? In the example above, the most recent would be 8th Jan 2010.
Is there a good way to use SQL to sort by date and pick the MAX date?
My other idea was to add a column called ‘MostRecent’ and just set this to TRUE.
EmpName AttandanceDate MostRecent
Joe Bloggs 1st Jan 2007 False
Joe Bloggs 4th Jan 2008 False
Joe Bloggs 3rd Jan 2009 False
Joe Bloggs 8th Jan 2010 True
I wondered if this would simplify the SQL i.e.
SELECT Joe Bloggs WHERE MostRecent = ‘TRUE’
Also, when the user updates a given employee’s attendance record (i.e. with latest attendance date) I could use SQL to:
- Search for the employee and set the MostRecent value to FALSE
- Add a new record with MostRecent set to TRUE?
Would anybody recommended either method over the other? Or do you have a completely different way of solving this problem?
To get the last attendance date use the group function called MAX, i.e.
SELECT MAX(AttandanceDate)
FROM course_data
WHERE employee_name = 'Joe Bloggs'
To get the max attendance date for all the employees:
SELECT employee_name, MAX(AttandanceDate)
FROM course_data
GROUP BY employee_name
ORDER BY employee_name
Query above will NOT return data for employees who haven't attended any courses. So you need to execute a different query.
SELECT A.employee_name, B.AttandanceDate
FROM employee AS A
LEFT JOIN (
SELECT employee_id, MAX(AttandanceDate) AS AttandanceDate
FROM course_data
GROUP BY employee_id
) AS B ON A.id = B.employee_id
ORDER BY A.employee_name
For employees who haven't attended any course, the query will return a NULL AttendanceDate
.
The flag is redundant. The other way how to get last attend day by employee:
select top 1 AttandanceDate
from course_data
WHERE employee_name = 'Joe Bloggs'
order by AttandanceDate desc
This may already be the case, but the output from the AttandanceDate columns makes me suspicious that that column may not be a datetime column. Most RDBMS's have some sort of date, time, and/or date time data types to use for storing this information. In which KandadaBoggu's AND OMG Ponies responses are perfect. But if you are storing your dates as strings you WILL have issues trying to do any of their suggestions.
Using a date time data type usually also opens you to the possibilites of obtaining date details like:
e.g. SELECT YEAR(2008-01-01) will return 2008 as an integer.
If you are running SQL Server 2005 or 2008 or later, you can use row_number()
do something like the following. This will list everyone, with their most recent attendance.
with temp1 as
(select *
, (row_number() over (partition by EmpName order by AttandanceDate descending))
as [CourseAttendanceOrder]
from AttendanceHistory)
select *
from temp
where CourseAttendanceOrder = 1
order by EmpName
This could be put into a view so you can use it as needed.
However, if you always will be focused on one individual at a time, it may be more efficient to make a stored procedure that can use statements like select max(AttandanceDate)
for just the person you are working on.
精彩评论