Advice on a complex SQL query for a BIRT dataset
I have the following (simplified) PostgreSQL database table containing info about maintenance done on a certain device:
id bigint NOT NULL,
"time" timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text,
The action_name field can have four values of interest:
MAINTENANCE_START
DEVICE_DEFECT
DEVICE_REPAIRED
MAINTENANCE_STOP
<other (irrelevant) values>
I have to do a BIRT report using the information from this table. I should have an entry in a table each time a MAINTENANCE_STOP action is encountered. If between this MAINTENANCE_STOP action and the its corresponding MAINTENANCE_START action (should be the MAINTENANCE_START action with the max "time" value smaller than that of the MAINTENANCE_STOP action) I encounter a DEVICE_DEFECT or DEVICE_REPAIRED action I should write in a table cell the string "Device not available", else I should write "Device available".
Also, I should compute the duration of the maintenance as the time difference between the MAINTENANCE_STOP action and the MAINTE开发者_如何学运维NANCE_START action.
I first attempted to do this in the SQL query, but now I'm not sure it's possible. What approach do you recommend?
My working snippet:
CREATE TABLE "log"
(
id bigint NOT NULL,
time timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text
);
insert into log(id,time,action_name,action_info) values ( 1, '2011-01-01', 'MAINTENANCE_START', 'maintenance01start');
insert into log(id,time,action_name,action_info) values ( 2, '2011-02-01', 'MAINTENANCE_START', 'maintenance02start');
insert into log(id,time,action_name,action_info) values ( 3, '2011-03-01', 'MAINTENANCE_START', 'maintenance03start');
insert into log(id,time,action_name,action_info) values ( 4, '2011-04-01', 'MAINTENANCE_START', 'maintenance04start');
insert into log(id,time,action_name,action_info) values ( 5, '2011-01-10', 'MAINTENANCE_STOP', 'maintenance01stop');
insert into log(id,time,action_name,action_info) values ( 6, '2011-02-10', 'MAINTENANCE_STOP', 'maintenance02stop');
insert into log(id,time,action_name,action_info) values ( 7, '2011-03-10', 'MAINTENANCE_STOP', 'maintenance03stop');
--insert into log(id,time,action_name,action_info) values ( 8, '2011-04-10', 'MAINTENANCE_STOP', 'maintenance04stop');
insert into log(id,time,action_name,action_info) values ( 9, '2011-02-05', 'DEVICE_DEFECT', 'maintenance02defect');
insert into log(id,time,action_name,action_info) values ( 10, '2011-03-05', 'DEVICE_REPAIRED', 'maintenance03repaired');
select
maintenance.start as start
, maintenance.stop as stop
, count (device_action.*) as device_actions
from (select
l_start.time as start
, (select time
from log l_stop
where l_stop.time > l_start.time
and l_stop.action_name = 'MAINTENANCE_STOP'
order by time asc limit 1) as stop
from log l_start
where l_start.action_name='MAINTENANCE_START' order by l_start.time asc) maintenance
left join log device_action
on device_action.time > maintenance.start
and device_action.time < maintenance.stop
and device_action.action_name like 'DEVICE_%'
group by maintenance.start
, maintenance.stop
order by maintenance.start asc
;
Be carefull with performance. If Postgres didn't optimize nested query, it would take O(n^2) time.
If you may:
- Change structure. E.g. one table DEVICE_MAINTENANCES with maintenance ID and second table DEVICE_MAINTENANCE_ACTIONS with foreign key DEVICE_MAINTENANCES.ID. Queries will be simpler and faster.
- If not, treat
time
as primary key (implict index) - If not, create index on
time
column.
精彩评论