How do I query time-period values for continuous data?
I've got records that indicate when a job starts and when it ends. The End time is not recorded until the job ends, but the start time exists from the time the job starts. What I want is to know how many jobs were running in a given time period.
declare @data table (
JobId INT IDENTITY(1,1),
StartedAt DATETIME NOT NULL,
EndedAt DATETIME NULL
)
insert into @data (StartedAt, EndedAt)
select '1 Jan 2010 8:00', '1 Jan 2010 8:30'
union select '1 Jan 2010 8:00', '1 Jan 2010 9:00'
union select '1 Jan 2010 8:00', '1 Jan 2010 9:20'
union select '1 Ja开发者_C百科n 2010 9:00', '1 Jan 2010 9:20'
union select '1 Jan 2010 9:10', NULL
Given the above, how would I query the number of jobs running in each hour? I would expect the results to indicate that there were 3 jobs running in the 8:00 through 8:59 time period, and 4 running in the 9:00 through 9:59 period, as follows:
Time period Jobs 08:00..08:59 3 09:00..09:59 4
I can modify the schema and I have some influence over how data is recorded if that would make the query simpler or more efficient.Blockquote
Without spending huge amounts of time on the problem, I'd suggest the simplest solution would be to create a table of days, and a table of hours (then you have a table of hours and days by creating a cartesian join) e.g.
days (ondate not null) slots (start_time time not null, end_time no_null)
then identifying the number of jobs which had a start or end time within those slots:
SELECT days.ondate, slots.start_time, slots.end_time, (
SELECT COUNT(*) FROM @data
WHERE (@data.start_time BETWEEN
STR_TO_DATE(
CONCATENATE(DATE_FORMAT(days.ondate, 'YYYYMMDD'),
DATE_FORMAT(slots.start_time,'HH24MISS')),
'YYYYMMDDHH25MISS')
AND
STR_TO_DATE(
CONCATENATE(DATE_FORMAT(days.ondate, 'YYYYMMDD'),
DATE_FORMAT(slots.end_time,'HH24MISS')),
'YYYYMMDDHH25MISS')
) OR (@data.start_time BETWEEN
CONCATENATE(DATE_FORMAT(days.ondate, 'YYYYMMDD'),
DATE_FORMAT(slots.start_time,'HH24MISS')),
'YYYYMMDDHH25MISS')
AND
STR_TO_DATE(
CONCATENATE(DATE_FORMAT(days.ondate, 'YYYYMMDD'),
DATE_FORMAT(slots.end_time,'HH24MISS')),
'YYYYMMDDHH25MISS')
) AS jobs_running
FROM days, slots
;
Of course this is going to get very inefficient very quickly - so you might want to think about pre-aggregation.
精彩评论