Find Consecutive Rows & Calculate Duration

I have a set of data that tells me whether a couple of systems are available, recorded in 5- or 15-minute increments. For now, the time increment shouldn't matter.

The data looks like this:

Status     Time         System_ID
T          10:00          S01
T          10:15          S01
F          10:30          S01
F          10:45          S01
F          11:00          S01
T          11:15          S01
T          11:30          S01
F          11:45          S01
F          12:00          S01
F          12:15          S01
T          12:30          S01

F          10:00          S02
F          10:15          S02
F          10:30          S02
F          10:45          S02
F          11:00          S02
T          11:15          S02
T          11:30          S02

I want to create a view that shows when a system is NOT available (i.e. when Status is F): from what time, to what time, and the duration, which is To minus From.

Desired results:

System_ID    From            To           Duration
S01          10:30          11:00          00:30 
S01          11:45          12:15          00:30 
S02          10:00          11:00          01:00 

Here is the script data:

DROP SCHEMA IF EXISTS Sys_data CASCADE;
CREATE SCHEMA Sys_data;

CREATE TABLE test_data (
          status BOOLEAN,
          dTime TIME,
          sys_ID VARCHAR(10),
          PRIMARY KEY (dTime, sys_ID)
);

INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '10:00:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '10:15:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:30:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:45:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:00:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:15:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:30:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:45:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '12:00:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '12:15:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '12:30:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:00:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:15:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:30:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:45:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:00:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:15:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:30:00', 'S02');

Thank you in advance!


Maybe not optimal, but it works :)

select sys_id, first_time as down_from, max(dTime) as down_to
from ( select status, sys_id, dTime,
            (select min(td_add2.dTime) 
                from test_data td_add2
               where td_add2.dtime <=  x.dTime
                 and td_add2.dtime >=  COALESCE(x.prev_time,x.min_time)
                 and td_add2.status = x.status       
                 and td_add2.sys_id = x.sys_id ) as first_time
         from ( select td_main.status, td_main.sys_id, td_main.dTime,       
                                (select max(td_add.dTime) 
                                   from test_data td_add 
                                  where td_add.dtime <  td_main.dTime
                                    and  td_add.status != td_main.status       
                                    and td_add.sys_id = td_main.sys_id ) as prev_time,
                                (select min(td_add.dTime) 
                                   from test_data td_add 
                                  where td_add.dtime <  td_main.dTime
                                    and td_add.sys_id = td_main.sys_id ) as min_time                                                                                                    
                from test_data td_main) x
      ) y
where status = false 
and first_time is not null 
group by sys_id, first_time
order by sys_id, first_time;

+--------+-----------+----------+
| sys_id | down_from | down_to  |
+--------+-----------+----------+
| S01    | 10:30:00  | 11:00:00 |
| S01    | 11:45:00  | 12:15:00 |
| S02    | 10:00:00  | 11:00:00 |
+--------+-----------+----------+
3 rows in set (0.00 sec)


Here is a cursor-based solution. I don't know whether MySQL supports PARTITION BY, hence the cursor. This has been tested in SQL Server 2008 and works there; I hope it works in MySQL, but at the very least it should give you an idea.

CREATE TABLE #offline_data
    (
     dTime DATETIME
    ,sys_ID VARCHAR(50)
    ,GroupID INTEGER
    )


DECLARE @status BIT
DECLARE @dTime DATETIME 
DECLARE @sys_ID VARCHAR(50)

DECLARE @GroupID INTEGER = 0


DECLARE test_cur CURSOR 
FOR SELECT
[status]
,[dTime]
,[sys_ID]
FROM
[dbo].[test_data]
ORDER BY
[sys_ID]
,[dTime] -- rows must arrive in this order for GroupID to mark consecutive runs

OPEN test_cur
FETCH NEXT FROM test_cur INTO @status, @dTime, @sys_ID

WHILE @@FETCH_STATUS = 0 
    BEGIN

        IF @status = 0 
            INSERT  [#offline_data]
                    ( [dTime] , [sys_ID] , [GroupID] )
            VALUES
                    ( @dTime , @sys_ID , @GroupID )
        ELSE 
            SET @GroupID += 1

        FETCH NEXT FROM test_cur INTO @status, @dTime, @sys_ID
    END

CLOSE test_cur
DEALLOCATE test_cur

SELECT
    [sys_ID] 'SYSTEM_ID'
   ,CONVERT(VARCHAR(8) , MIN([dTime]) , 108) 'FROM'
   ,CONVERT(VARCHAR(8) , MAX([dTime]) , 108) 'TO'
   ,CONVERT(VARCHAR(8) , DATEADD(mi , DATEDIFF(mi , MIN([dTime]) , MAX([dTime])) , '1900-01-01T00:00:00.000') , 108) 'DURATION'
FROM
    #offline_data
GROUP BY
    [sys_ID]
   ,[GroupID]
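
For readers who do not use T-SQL, the idea behind the cursor loop can be sketched in Python. This is only an illustration under stated assumptions: the names sample_rows and offline_spans and the tuple layout (status, dtime, sys_id) are made up for this sketch, times are 'HH:MM:SS' strings, and the rows are sorted by system and time before the loop, which the grouping relies on.

```python
def sample_rows():
    """Rebuild the question's sample data as (status, dtime, sys_id) tuples."""
    flags_by_system = {"S01": "TTFFFTTFFFT", "S02": "FFFFFTT"}
    rows = []
    for sys_id, flags in flags_by_system.items():
        for i, flag in enumerate(flags):
            minutes = 10 * 60 + 15 * i  # readings start at 10:00, every 15 minutes
            rows.append((flag == "T", f"{minutes // 60:02d}:{minutes % 60:02d}:00", sys_id))
    return rows


def offline_spans(rows):
    """Mimic the cursor: bump a counter on every T row, tag F rows with it."""
    group_id = 0
    spans = {}  # (sys_id, group_id) -> (first F time, last F time)
    for status, dtime, sys_id in sorted(rows, key=lambda r: (r[2], r[1])):
        if status:
            group_id += 1  # an available row closes the current run of F rows
            continue
        first, last = spans.get((sys_id, group_id), (dtime, dtime))
        spans[(sys_id, group_id)] = (min(first, dtime), max(last, dtime))
    return [(sys_id, first, last) for (sys_id, _), (first, last) in sorted(spans.items())]


if __name__ == "__main__":
    for span in offline_spans(sample_rows()):
        print(span)
```

Note that the counter is shared by all systems, so the grouping key has to include the system ID as well, just as the final query groups by sys_ID and GroupID.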


A little bit longer, but it seems to work in PostgreSQL. Basic principle:

  1. find the times where the system status changes
  2. keep only the first and last time of each run, i.e. the rows where the previous status was different and the rows where the next status is different (or there is no previous/next row at all)
  3. compute the difference between the two

Here is the code:

SELECT sys_id,
    status,
    start_time AS down_from,
    coalesce(end_time, end_time2) AS down_to,
    coalesce(end_time, end_time2) - start_time duration
FROM (
SELECT sys_id, status, start_time, end_time,
lead(end_time) over (partition by sys_id order by dtime) end_time2
FROM (  
    SELECT sys_id, status, dtime, start_time, end_time
    FROM (
        SELECT sys_id, status, dtime, 
        CASE WHEN last_status != status OR last_status IS NULL THEN dtime ELSE NULL END start_time,
        CASE WHEN next_status != status OR next_status IS NULL THEN dtime ELSE NULL END end_time
        FROM (
        SELECT sys_id, status, dtime,
            LAG(status) OVER (PARTITION BY sys_id ORDER BY sys_id, dtime) last_status,
            LEAD(status) OVER (PARTITION BY sys_id ORDER BY sys_id, dtime) next_status
            FROM test_data 
            ORDER BY sys_id, dtime
        ) surrounding_status
    ) last_next_times

    WHERE start_time IS NOT NULL OR end_time IS NOT NULL
    ORDER BY sys_id, dtime
) start_end_times
) find_last_time
WHERE start_time IS NOT NULL AND status = FALSE
ORDER BY sys_id, start_time;

It is just quick code; there might be a simpler solution, I think.
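
One possibly simpler variant is to number the rows twice and group on the difference between the two numberings, which is constant within a run of consecutive rows with the same status. The sketch below tries this against an in-memory SQLite database from Python rather than PostgreSQL, so treat it as an assumption-laden illustration: it needs SQLite 3.25 or later for window functions, stores status as 0/1 and times as 'HH:MM:SS' text, and the names sample_rows, find_outages and grp are invented here.

```python
import sqlite3
from datetime import datetime

QUERY = """
SELECT sys_id, MIN(dtime) AS down_from, MAX(dtime) AS down_to
FROM (
    SELECT sys_id, status, dtime,
           ROW_NUMBER() OVER (PARTITION BY sys_id ORDER BY dtime)
         - ROW_NUMBER() OVER (PARTITION BY sys_id, status ORDER BY dtime) AS grp
    FROM test_data
) runs
WHERE status = 0
GROUP BY sys_id, grp
ORDER BY sys_id, down_from
"""


def sample_rows():
    """Rebuild the question's sample data as (status, dtime, sys_id) tuples."""
    flags_by_system = {"S01": "TTFFFTTFFFT", "S02": "FFFFFTT"}
    rows = []
    for sys_id, flags in flags_by_system.items():
        for i, flag in enumerate(flags):
            minutes = 10 * 60 + 15 * i  # readings start at 10:00, every 15 minutes
            rows.append((int(flag == "T"), f"{minutes // 60:02d}:{minutes % 60:02d}:00", sys_id))
    return rows


def find_outages(rows):
    """Load the rows into SQLite and return (sys_id, from, to, duration) per outage."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_data (status INTEGER, dtime TEXT, sys_id TEXT,"
        " PRIMARY KEY (dtime, sys_id))"
    )
    conn.executemany("INSERT INTO test_data VALUES (?, ?, ?)", rows)
    outages = []
    for sys_id, down_from, down_to in conn.execute(QUERY):
        # Duration is computed in Python to avoid relying on SQLite date arithmetic.
        duration = datetime.strptime(down_to, "%H:%M:%S") - datetime.strptime(down_from, "%H:%M:%S")
        outages.append((sys_id, down_from, down_to, str(duration)))
    conn.close()
    return outages


if __name__ == "__main__":
    for outage in find_outages(sample_rows()):
        print(outage)
```

The same SELECT should carry over to PostgreSQL with a BOOLEAN status column by changing the filter to status = FALSE, though that has not been tested here.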
