Oracle Event Count Query
My SAMPLE
table has the following five columns:
sample_id (PK) (NUMBER)
sampled_on (DATE)
received_on (DATE)
complete开发者_开发百科d_on (DATE)
authorized_on (DATE)
I would like a query with one row per hour (constrained by a given date range) and five columns:
- The hour
YYYY-MM-DD HH24
- Number of samples sampled during that hour
- Number of samples received during that hour
- Number of samples completed during that hour
- Number of samples authorized during that hour
Please provide a query or at least a point in the right direction.
Reopened with bounty:
+300 reputation for the first person to incorporate Rob van Wijk's answer (single access to sample) into a view where I can efficiently query by date range (start_date/end_date
or start_date/num_days
).Try:
CREATE OR REPLACE VIEW my_view AS
WITH date_bookends AS (
SELECT LEAST(MIN(t.sampled_on), MIN(t.received_on), MIN(t.completed_on), MIN(t.authorized_on)) 'min_date'
GREATEST(MAX(t.sampled_on), MAX(t.received_on), MAX(t.completed_on), MAX(t.authorized_on)) 'max_date'
FROM SAMPLE t),
all_hours AS (
SELECT t.min_date + numtodsinterval(LEVEL - 1,'hour') date_by_hour
FROM date_bookends t
CONNECT BY LEVEL <= ( t.max_date - t.min_date + 1) * 24)
SELECT h.date_by_hour,
COUNT(CASE WHEN h.hour = TRUNC(s.sampled_on,'hh24') THEN 1 END) sampled#
COUNT(CASE WHEN h.hour = TRUNC(s.received_on,'hh24') THEN 1 END) received#
COUNT(CASE WHEN h.hour = TRUNC(s.completed_on,'hh24') THEN 1 END) completed#
COUNT(CASE WHEN h.hour = TRUNC(s.authorized_on,'hh24') THEN 1 END) authorized#
FROM all_hours h
CROSS JOIN sample s
GROUP BY h.hour
Without using Subquery Factoring:
CREATE OR REPLACE VIEW my_view AS
SELECT h.date_by_hour,
COUNT(CASE WHEN h.hour = TRUNC(s.sampled_on,'hh24') THEN 1 END) sampled#
COUNT(CASE WHEN h.hour = TRUNC(s.received_on,'hh24') THEN 1 END) received#
COUNT(CASE WHEN h.hour = TRUNC(s.completed_on,'hh24') THEN 1 END) completed#
COUNT(CASE WHEN h.hour = TRUNC(s.authorized_on,'hh24') THEN 1 END) authorized#
FROM (SELECT t.min_date + numtodsinterval(LEVEL - 1,'hour') date_by_hour
FROM (SELECT LEAST(MIN(t.sampled_on), MIN(t.received_on), MIN(t.completed_on), MIN(t.authorized_on)) 'min_date'
GREATEST(MAX(t.sampled_on), MAX(t.received_on), MAX(t.completed_on), MAX(t.authorized_on)) 'max_date'
FROM SAMPLE t) t
CONNECT BY LEVEL <= ( t.max_date - t.min_date + 1) * 24) h
CROSS JOIN sample s
GROUP BY h.hour
The query accesses the SAMPLES table twice - the first time to get the earliest & latest date to frame the construction of the date_by_hour
value.
This may not be the prettiest or most optimal solution, but it seems to work. Explanation: first convert all the dates to YYYY-MM-DD HH24 format, next gather number sampled/received/completed/authorized by date+HH24, finally join together.
with sample_hour as
(select sample_id,
to_char(sampled_on, 'YYYY-MM-DD HH24') sampled_on,
to_char(received_on, 'YYYY-MM-DD HH24') received_on,
to_char(completed_on, 'YYYY-MM-DD HH24') completed_on,
to_char(authorized_on, 'YYYY-MM-DD HH24') authorized_on
from sample),
s as
(select sampled_on thedate, count(*) num_sampled
from sample_hour
group by sampled_on),
r as
(select received_on thedate, count(*) num_received
from sample_hour
group by received_on),
c as
(select completed_on thedate, count(*) num_completed
from sample_hour
group by completed_on),
a as
(select authorized_on thedate, count(*) num_authorized
from sample_hour
group by authorized_on)
select s.thedate, num_sampled, num_received, num_completed, num_authorized
from s
left join r on s.thedate = r.thedate
left join c on s.thedate = c.thedate
left join a on s.thedate = a.thedate
;
This assumes a table "sample" created something like this:
create table sample
(sample_id number not null primary key,
sampled_on date,
received_on date,
completed_on date,
authorized_on date);
Here is an example. First create the table and insert some random data.
SQL> create table sample
2 ( sample_id number primary key
3 , sampled_on date
4 , received_on date
5 , completed_on date
6 , authorized_on date
7 )
8 /
Tabel is aangemaakt.
SQL> insert into sample
2 select level
3 , trunc(sysdate) + dbms_random.value(0,2)
4 , trunc(sysdate) + dbms_random.value(0,2)
5 , trunc(sysdate) + dbms_random.value(0,2)
6 , trunc(sysdate) + dbms_random.value(0,2)
7 from dual
8 connect by level <= 1000
9 /
1000 rijen zijn aangemaakt.
Then introduce the variables for your given date range and fill them.
SQL> var DATE_RANGE_START varchar2(10)
SQL> var DATE_RANGE_END varchar2(10)
SQL> exec :DATE_RANGE_START := '2009-10-23'
PL/SQL-procedure is geslaagd.
SQL> exec :DATE_RANGE_END := '2009-10-24'
PL/SQL-procedure is geslaagd.
First you'll have to generate all hours in your given date range. This makes sure that in case you have an hour where no dates are present, you'll still have a record with 4 zeros. The implementation is in the all_hours query. The rest of the query (with only one table access to your sample table!) can then be quite simple like this.
SQL> with all_hours as
2 ( select to_date(:DATE_RANGE_START,'yyyy-mm-dd') + numtodsinterval(level-1,'hour') hour
3 from dual
4 connect by level <=
5 ( to_date(:DATE_RANGE_END,'yyyy-mm-dd')
6 - to_date(:DATE_RANGE_START,'yyyy-mm-dd')
7 + 1
8 ) * 24
9 )
10 select h.hour
11 , count(case when h.hour = trunc(s.sampled_on,'hh24') then 1 end) sampled#
12 , count(case when h.hour = trunc(s.received_on,'hh24') then 1 end) received#
13 , count(case when h.hour = trunc(s.completed_on,'hh24') then 1 end) completed#
14 , count(case when h.hour = trunc(s.authorized_on,'hh24') then 1 end) authorized#
15 from all_hours h
16 cross join sample s
17 group by h.hour
18 /
HOUR SAMPLED# RECEIVED# COMPLETED# AUTHORIZED#
------------------- ---------- ---------- ---------- -----------
23-10-2009 00:00:00 18 25 20 20
23-10-2009 01:00:00 26 24 16 13
23-10-2009 02:00:00 16 26 17 15
23-10-2009 03:00:00 19 18 27 13
23-10-2009 04:00:00 28 20 18 23
23-10-2009 05:00:00 17 13 19 21
23-10-2009 06:00:00 18 23 16 15
23-10-2009 07:00:00 19 24 14 22
23-10-2009 08:00:00 21 19 23 22
23-10-2009 09:00:00 25 20 23 24
23-10-2009 10:00:00 16 21 25 18
23-10-2009 11:00:00 21 29 21 18
23-10-2009 12:00:00 33 28 24 20
23-10-2009 13:00:00 24 19 15 15
23-10-2009 14:00:00 20 27 16 25
23-10-2009 15:00:00 15 25 27 13
23-10-2009 16:00:00 19 14 27 18
23-10-2009 17:00:00 22 22 15 27
23-10-2009 18:00:00 20 19 29 23
23-10-2009 19:00:00 20 18 17 23
23-10-2009 20:00:00 11 18 20 27
23-10-2009 21:00:00 13 25 24 19
23-10-2009 22:00:00 22 13 22 29
23-10-2009 23:00:00 20 20 19 24
24-10-2009 00:00:00 18 17 18 29
24-10-2009 01:00:00 23 30 26 21
24-10-2009 02:00:00 28 19 28 25
24-10-2009 03:00:00 21 21 11 23
24-10-2009 04:00:00 23 20 21 17
24-10-2009 05:00:00 24 16 23 23
24-10-2009 06:00:00 23 26 22 30
24-10-2009 07:00:00 25 26 18 12
24-10-2009 08:00:00 24 20 23 17
24-10-2009 09:00:00 18 26 15 19
24-10-2009 10:00:00 20 19 25 18
24-10-2009 11:00:00 19 27 17 20
24-10-2009 12:00:00 23 16 18 20
24-10-2009 13:00:00 15 15 22 19
24-10-2009 14:00:00 23 23 16 29
24-10-2009 15:00:00 18 31 32 28
24-10-2009 16:00:00 22 15 18 13
24-10-2009 17:00:00 25 17 20 26
24-10-2009 18:00:00 19 20 21 16
24-10-2009 19:00:00 22 13 28 29
24-10-2009 20:00:00 23 17 23 14
24-10-2009 21:00:00 18 18 21 22
24-10-2009 22:00:00 22 20 18 21
24-10-2009 23:00:00 21 18 22 22
48 rijen zijn geselecteerd.
Hope this helps.
Regards, Rob.
I'd do are 4 queries like this (one for each date):
SELECT <date to hour>, count(*) FROM sample GROUP BY <date to hour>
And then put the data together in the application. If you really want a single query, you can join the individual queries on hour
.
Try this...
WITH src_data AS
( SELECT sample_id
, TRUNC( sampled_on, 'HH24' ) sampled_on
, TRUNC( received_on, 'HH24' ) received_on
, TRUNC( completed_on, 'HH24' ) completed_on
, TRUNC( authorized_on, 'HH24' ) authorized_on
FROM sample
)
, src_hours AS
( SELECT sampled_on the_date
FROM src_data
WHERE sampled_on IS NOT NULL
UNION
SELECT received_on the_date
FROM src_data
WHERE received_on IS NOT NULL
UNION
SELECT completed_on the_date
FROM src_data
WHERE completed_on IS NOT NULL
UNION
SELECT authorized_on the_date
FROM src_data
WHERE authorized_on IS NOT NULL
)
SELECT h.the_date
, ( SELECT COUNT(*)
FROM src_data s
WHERE s.sampled_on = h.the_date ) num_sampled_on
, ( SELECT COUNT(*)
FROM src_data r
WHERE r.received_on = h.the_date ) num_received_on
, ( SELECT COUNT(*)
FROM src_data c
WHERE c.completed_on = h.the_date ) num_completed_on
, ( SELECT COUNT(*)
FROM src_data a
WHERE a.authorized_on = h.the_date ) num_authorized_on
FROM src_hours h
Maybe somthing like creating this view:
create view hours as
select hour, max(cnt_sample) cnt_sample, max(cnt_received) cnt_received, max(cnt_completed) cnt_completed, max(cnt_authorized) cnt_authorized
from (
select to_char(sampled_on , 'yyyymmddhh24') hour,
count(sample_id) over (partition by to_char(sampled_on ,'yyyymmddhh24')) cnt_sample,
0 cnt_received,
0 cnt_completed,
0 cnt_authorized from sample union all
select to_char(received_on , 'yyyymmddhh24') hour,
0 cnt_sample,
count(sample_id) over (partition by to_char(received_on ,'yyyymmddhh24')) cnt_received,
0 cnt_completed,
0 cnt_authorized from sample union all
select to_char(completed_on , 'yyyymmddhh24') hour,
0 cnt_sample,
0 cnt_received,
count(sample_id) over (partition by to_char(completed_on ,'yyyymmddhh24')) cnt_completed,
0 cnt_authorized from sample union all
select to_char(authorized_on, 'yyyymmddhh24') hour,
0 cnt_sample,
0 cnt_received,
0 cnt_completed,
count(sample_id) over (partition by to_char(authorized_on ,'yyyymmddhh24')) cnt_authorized from sample
)
group by hour
;
and then selecting from the view:
select * from hours where hour >= '2001010102' and hour <= '2001010105'
order by hour;
I now propose:
create view hours_ as
with four as (
select 1 as n from dual union all
select 2 as n from dual union all
select 3 as n from dual union all
select 4 as n from dual )
select
case when four.n = 1 then trunc(sampled_on , 'hh24')
when four.n = 2 then trunc(received_on , 'hh24')
when four.n = 3 then trunc(completed_on , 'hh24')
when four.n = 4 then trunc(authorized_on, 'hh24')
end hour_,
sum ( case when four.n = 1 then 1
else 0
end ) sample_,
sum ( case when four.n = 2 then 1
else 0
end ) receive_,
sum ( case when four.n = 3 then 1
else 0
end ) complete_,
sum ( case when four.n = 4 then 1
else 0
end ) authorize_
from
four cross join sample
group by
case when four.n = 1 then trunc(sampled_on , 'hh24')
when four.n = 2 then trunc(received_on , 'hh24')
when four.n = 3 then trunc(completed_on , 'hh24')
when four.n = 4 then trunc(authorized_on, 'hh24')
end ;
In order to see if the view is indeed accessed only once:
explain plan for select * from hours_
where hour_ between sysdate -1 and sysdate;
select * from table (dbms_xplan.display);
Which results in:
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 61 | 16 (7)| 00:00:01 |
| 1 | VIEW | HOURS_ | 1 | 61 | 16 (7)| 00:00:01 |
| 2 | HASH GROUP BY | | 1 | 39 | 16 (7)| 00:00:01 |
|* 3 | FILTER | | | | | |
| 4 | NESTED LOOPS | | 1 | 39 | 15 (0)| 00:00:01 |
| 5 | VIEW | | 4 | 12 | 8 (0)| 00:00:01 |
| 6 | UNION-ALL | | | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 11 | TABLE ACCESS FULL| SAMPLE | 1 | 36 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Here's what I'm thinking, but I'm not sure it's optimal enough for a view.
select
the_date,
sum(decode(the_type,'S',the_count,0)) samples,
sum(decode(the_type,'R',the_count,0)) receipts,
sum(decode(the_type,'C',the_count,0)) completions,
sum(decode(the_type,'A',the_count,0)) authorizations
from(
select
trunc(sampled_on,'HH24') the_date,
'S' the_type,
count(1) the_count
FROM sample
group by trunc(sampled_on,'HH24')
union all
select
trunc(received_on,'HH24'),
'R',
count(1)
FROM sample
group by trunc(received_on,'HH24')
union all
select
trunc(completed_on,'HH24'),
'C',
count(1)
FROM sample
group by trunc(completed_on,'HH24')
union all
select
trunc(authorized_on,'HH24'),
'A',
count(1)
FROM sample
group by trunc(authorized_on,'HH24')
)
group by the_date
Then, to query, you could just query with normal date contructs:
select * from magic_view where the_date > sysdate-1;
EDIT
Okay, so I created a sample table and did some metrics:
create table sample (
sample_id number primary key,
sampled_on date,
received_on date,
completed_on date,
authorized_on date
);
insert into sample (
select
level,
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
from dual
connect by level <= 1000
);
The explain plan is:
---------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4000 | 97K| 25 (20)|
| 1 | HASH GROUP BY | | 4000 | 97K| 25 (20)|
| 2 | VIEW | | 4000 | 97K| 24 (17)|
| 3 | UNION-ALL | | | | |
| 4 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 5 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 6 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 7 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 8 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 9 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 10 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 11 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
---------------------------------------------------------------------
On my machine, the a query against this view for the past 24 hours completes in 23ms. Not bad, but it's only 1,000 rows. Before you discount the 4 separate queries, you'll need to do performance analysis of the individual solutions.
Similar to René Nyffenegger's idea. Filter by each type of date field, and then amalgamate the counts.
Note, that it's not possible to do this query in one Select, because you need to both Group and Order By each date field, this is impossible without splitting into separate sub-queries.
I have coded a date range of '2009-11-04' to '2009-11-04 23:59:59' for this example:
SELECT
DateHour,
SUM(sampled) total_sampled,
SUM(received) total_received,
SUM(completed) total_completed,
SUM(authorized) total_authorized
FROM
(SELECT
TO_CHAR(CREATED_DATE, 'YYYY-MM-DD HH24') DateHour,
1 sampled,
0 received,
0 completed,
0 authorized
FROM
SAMPLE
WHERE
sampled_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND sampled_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(CREATED_DATE, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
1 received,
0 completed,
0 authorized
FROM
SAMPLE
WHERE
received_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND received_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(CREATED_DATE, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
0 received,
1 completed,
0 authorized
FROM
SAMPLE
WHERE
completed_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND completed_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(CREATED_DATE, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
0 received,
0 completed,
1 authorized
FROM
SAMPLE
WHERE
authorized_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND authorized_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS'))
GROUP BY
DateHour
ORDER BY
DateHour
精彩评论