开发者

Advice on a complex SQL query for a BIRT dataset

I have the following (simplified) PostgreSQL database table containing info about maintenance done on a certain device:

id bigint NOT NULL,
"time" timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text,

The action_name field can have four values of interest:

MAINTENANCE_START
DEVICE_DEFECT
DEVICE_REPAIRED
MAINTENANCE_STOP
<other (irrelevant) values>

I have to do a BIRT report using the information from this table. I should have an entry in a table each time a MAINTENANCE_STOP action is encountered. If between this MAINTENANCE_STOP action and the its corresponding MAINTENANCE_START action (should be the MAINTENANCE_START action with the max "time" value smaller than that of the MAINTENANCE_STOP action) I encounter a DEVICE_DEFECT or DEVICE_REPAIRED action I should write in a table cell the string "Device not available", else I should write "Device available".

Also, I should compute the duration of the maintenance as the time difference between the MAINTENANCE_STOP action and the MAINTE开发者_如何学运维NANCE_START action.

I first attempted to do this in the SQL query, but now I'm not sure it's possible. What approach do you recommend?


My working snippet:

CREATE TABLE "log"
(
  id bigint NOT NULL,
  time timestamp(0) with time zone,
  action_name text NOT NULL,
  action_info text NOT NULL DEFAULT ''::text
);

insert into log(id,time,action_name,action_info) values ( 1, '2011-01-01', 'MAINTENANCE_START', 'maintenance01start');
insert into log(id,time,action_name,action_info) values ( 2, '2011-02-01', 'MAINTENANCE_START', 'maintenance02start');
insert into log(id,time,action_name,action_info) values ( 3, '2011-03-01', 'MAINTENANCE_START', 'maintenance03start');
insert into log(id,time,action_name,action_info) values ( 4, '2011-04-01', 'MAINTENANCE_START', 'maintenance04start');
insert into log(id,time,action_name,action_info) values ( 5, '2011-01-10', 'MAINTENANCE_STOP', 'maintenance01stop');
insert into log(id,time,action_name,action_info) values ( 6, '2011-02-10', 'MAINTENANCE_STOP', 'maintenance02stop');
insert into log(id,time,action_name,action_info) values ( 7, '2011-03-10', 'MAINTENANCE_STOP', 'maintenance03stop');
--insert into log(id,time,action_name,action_info) values ( 8, '2011-04-10', 'MAINTENANCE_STOP', 'maintenance04stop');
insert into log(id,time,action_name,action_info) values ( 9, '2011-02-05', 'DEVICE_DEFECT', 'maintenance02defect');
insert into log(id,time,action_name,action_info) values ( 10, '2011-03-05', 'DEVICE_REPAIRED', 'maintenance03repaired');

select 
  maintenance.start as start
, maintenance.stop as stop
, count (device_action.*) as device_actions
from (select 
  l_start.time as start
  , (select time 
      from log l_stop 
      where l_stop.time > l_start.time 
      and l_stop.action_name = 'MAINTENANCE_STOP'
      order by time asc limit 1) as stop
  from log l_start
  where l_start.action_name='MAINTENANCE_START' order by l_start.time asc) maintenance
left join log device_action
  on device_action.time > maintenance.start
  and device_action.time < maintenance.stop
  and device_action.action_name like 'DEVICE_%'
group by maintenance.start
  , maintenance.stop
order by maintenance.start asc
;

Be carefull with performance. If Postgres didn't optimize nested query, it would take O(n^2) time.

If you may:

  1. Change structure. E.g. one table DEVICE_MAINTENANCES with maintenance ID and second table DEVICE_MAINTENANCE_ACTIONS with foreign key DEVICE_MAINTENANCES.ID. Queries will be simpler and faster.
  2. If not, treat time as primary key (implict index)
  3. If not, create index on time column.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜