Grouping by sliding window of 60 minutes
I have a table with information about jobs in it: the job type, the start time, and the end time.
I would like a report that shows me how many jobs ran in each hour - but not clock-hours but 60 minute off-set hours from the time of the first job in that group (they are batches of jobs, so I know for certain that the job-types won't overlap in time).
Something like this:
num_of_jobs | job_ty开发者_JAVA百科pe | hour ----------------------------- 1254 | B | 2011-08-22 13:47 9983 | B | 2011-08-22 14:47 9072 | B | 2011-08-22 15:47 20309 | B | TOTAL 79 | C | 2011-08-22 16:02 105 | C | 2011-08-22 17:02 184 | C | TOTAL 10234 | D | 2011-08-22 17:29 9882 | D | 2011-08-22 18:29 20116 | D | TOTAL
If it was OK to group by Hours (such as 12:00, 13:00, 14:00) I can do this easily enough like this (very simplified version of actual query):
select count(job_id) number_of_jobs, job_type,
case when to_char(end_date,'YYYY-MM-DD HH24') is not null
then to_char(end_date,'YYYY-MM-DD HH24')||':00'
else 'TOTAL'
end Date_and_hour
from my_jobs
where end_date is not null
group by rollup (to_char(end_date,'YYYY-MM-DD HH24')) , job_type
order by to_char(end_date,'YYYY-MM-DD HH24') asc, job_type asc;
Except it groups on the Hours, not on the smallest start time in a batch based on job_type.
I'm really not quite sure how to get this working.
(using Oracle 10g)
It seems to me that the real issue is in finding which hour-group each row belongs to. Once you have that, you can plug it into the query you provided to get your final answer. To get the hour-group, all we have to do is figure out how many hours have elapse since the earliest time in the group:
select floor((end_date - min(end_date)
over (partition by job_type))*24)+1 as hour_group
from my_jobs
where end_date is not null
I'm using trunc
because we only care about whole hours and adding one so that we start counting at one, rather than zero. Since this solution uses analytic functions, you'll need to put it in a subquery before grouping.
Playing with it a little, I get the following (untested) query in the end:
SELECT COUNT(job_id) number_of_jobs,
job_type,
min_time + FLOOR((end_date - min_time) * 24) / 24 AS date_and_hour
FROM (SELECT job_id,
job_type,
end_date,
MIN(end_date) OVER (PARTITION BY job_type) AS min_time
FROM my_jobs
WHERE end_date IS NOT NULL)
GROUP BY ROLLUP(min_time + FLOOR((end_date - min_time) * 24) / 24), job_type
ORDER BY hour_group ASC, job_type ASC;
The principle is the same as my initial answer, I just moved the math around a little.
精彩评论