Find Consecutive Rows & Calculate Duration
I have a set of data that tells me whether a couple of systems are available or not, sampled in 5- or 15-minute increments. For now, the time increment shouldn't matter.
The data looks like this:
Status Time System_ID
T 10:00 S01
T 10:15 S01
F 10:30 S01
F 10:45 S01
F 11:00 S01
T 11:15 S01
T 11:30 S01
F 11:45 S01
F 12:00 S01
F 12:15 S01
T 12:30 S01
F 10:00 S02
F 10:15 S02
F 10:30 S02
F 10:45 S02
F 11:00 S02
T 11:15 S02
T 11:30 S02
I want to create a view that tells me when a system is NOT available (i.e. when its status is F): from what time, to what time, and the duration, which is To minus From.
Desired results:
System_ID From To Duration
S01 10:30 11:00 00:30
S01 11:45 12:15 00:30
S02 10:00 11:00 01:00
Here is the script to create the data:
DROP SCHEMA IF EXISTS Sys_data CASCADE;
CREATE SCHEMA Sys_data;
CREATE TABLE test_data (
status BOOLEAN,
dTime TIME,
sys_ID VARCHAR(10),
PRIMARY KEY (dTime, sys_ID)
);
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '10:00:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '10:15:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:30:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:45:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:00:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:15:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:30:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:45:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '12:00:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '12:15:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '12:30:00', 'S01');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:00:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:15:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:30:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:45:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:00:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:15:00', 'S02');
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:30:00', 'S02');
Thank you in advance!
Maybe not optimal, but it works :)
select sys_id, first_time as down_from, max(dTime) as down_to
from ( select status, sys_id, dTime,
(select min(td_add2.dTime)
from test_data td_add2
where td_add2.dtime <= x.dTime
and td_add2.dtime >= COALESCE(x.prev_time,x.min_time)
and td_add2.status = x.status
and td_add2.sys_id = x.sys_id ) as first_time
from ( select td_main.status, td_main.sys_id, td_main.dTime,
(select max(td_add.dTime)
from test_data td_add
where td_add.dtime < td_main.dTime
and td_add.status != td_main.status
and td_add.sys_id = td_main.sys_id ) as prev_time,
(select min(td_add.dTime)
from test_data td_add
where td_add.dtime <= td_main.dTime -- <= so a system's very first row is not dropped
and td_add.sys_id = td_main.sys_id ) as min_time
from test_data td_main) x
) y
where status = false
and first_time is not null
group by sys_id, first_time
order by sys_id, first_time
+--------+-----------+----------+
| sys_id | down_from | down_to |
+--------+-----------+----------+
| S01 | 10:30:00 | 11:00:00 |
| S01 | 11:45:00 | 12:15:00 |
| S02 | 10:00:00 | 11:00:00 |
+--------+-----------+----------+
3 rows in set (0.00 sec)
Here is a cursor-based solution. I don't know whether MySQL supports PARTITION BY, which is why I used a cursor. This has been tested on SQL Server 2008 and it works there; I hope it works in MySQL too, but at least it will give you an idea.
CREATE TABLE #offline_data
(
dTime DATETIME
,sys_ID VARCHAR(50)
,GroupID INTEGER
)
DECLARE @status BIT
DECLARE @dTime DATETIME
DECLARE @sys_ID VARCHAR(50)
DECLARE @GroupID INTEGER = 0
DECLARE test_cur CURSOR
FOR SELECT
[status]
,[dTime]
,[sys_ID]
FROM
[dbo].[test_data]
ORDER BY
[sys_ID]
,[dTime]
OPEN test_cur
FETCH NEXT FROM test_cur INTO @status, @dTime, @sys_ID
WHILE @@FETCH_STATUS = 0
BEGIN
IF @status = 0
INSERT [#offline_data]
( [dTime] , [sys_ID] , [GroupID] )
VALUES
( @dTime , @sys_ID , @GroupID )
ELSE
SET @GroupID += 1
FETCH NEXT FROM test_cur INTO @status, @dTime, @sys_ID
END
CLOSE test_cur
DEALLOCATE test_cur
SELECT
[sys_ID] 'SYSTEM_ID'
,CONVERT(VARCHAR(8) , MIN([dTime]) , 108) 'FROM'
,CONVERT(VARCHAR(8) , MAX([dTime]) , 108) 'TO'
,CONVERT(VARCHAR(8) , DATEADD(mi , DATEDIFF(mi , MIN([dTime]) , MAX([dTime])) , '1900-01-01T00:00:00.000') , 108) 'DURATION'
FROM
#offline_data
GROUP BY
[sys_ID]
,[GroupID]
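The grouping trick used by the cursor can be sketched outside the database as well. The Python below is only an illustration of that idea, not part of the original answer: it walks the rows in (system, time) order, bumps a counter on every available row, and files each unavailable row under the current counter value. The names `ROWS`, `hhmm_diff` and `outages_by_group_counter` are hypothetical, and times are assumed to be zero-padded HH:MM strings within a single day.

```python
from datetime import datetime

# Sample rows from the question as (status, time, sys_id); True means available.
ROWS = (
    [(True, t, "S01") for t in ("10:00", "10:15", "11:15", "11:30", "12:30")]
    + [(False, t, "S01") for t in ("10:30", "10:45", "11:00", "11:45", "12:00", "12:15")]
    + [(False, t, "S02") for t in ("10:00", "10:15", "10:30", "10:45", "11:00")]
    + [(True, t, "S02") for t in ("11:15", "11:30")]
)


def hhmm_diff(start, end):
    """Return end - start as an HH:MM string (hypothetical helper)."""
    delta = datetime.strptime(end, "%H:%M") - datetime.strptime(start, "%H:%M")
    minutes = int(delta.total_seconds()) // 60
    return "%02d:%02d" % divmod(minutes, 60)


def outages_by_group_counter(rows):
    """Mimic the cursor loop: one pass, a counter that changes on every 'up' row."""
    group_id = 0
    groups = {}  # (sys_id, group_id) -> times of the 'down' rows in that group
    # Explicit ordering matters: the counter only works on time-ordered rows.
    for status, time, sys_id in sorted(rows, key=lambda r: (r[2], r[1])):
        if status:
            group_id += 1
        else:
            groups.setdefault((sys_id, group_id), []).append(time)
    # Equivalent of GROUP BY sys_ID, GroupID with MIN/MAX per group.
    return sorted(
        (sys_id, min(times), max(times), hhmm_diff(min(times), max(times)))
        for (sys_id, _), times in groups.items()
    )


for row in outages_by_group_counter(ROWS):
    print(*row)
```

Because the counter never resets between systems, the group key has to include the system ID, which is why the final SELECT groups by both columns.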
This is a little bit longer, but it seems to work in PostgreSQL. The basic principle:
- find the times where the system status changes
- keep only the first and last time of each run, i.e. rows where the previous status was different or the next status will be different (or there is no previous or next row at all)
- compute the difference
Here is the code:
SELECT sys_id,
status,
start_time AS down_from,
coalesce(end_time, end_time2) AS down_to,
coalesce(end_time, end_time2) - start_time AS duration
FROM (
SELECT sys_id, status, start_time, end_time,
lead(end_time) over (partition by sys_id order by dtime) end_time2
FROM (
SELECT sys_id, status, dtime, start_time, end_time
FROM (
SELECT sys_id, status, dtime,
CASE WHEN last_status != status OR last_status IS NULL THEN dtime ELSE NULL END start_time,
CASE WHEN next_status != status OR next_status IS NULL THEN dtime ELSE NULL END end_time
FROM (
SELECT sys_id, status, dtime,
LAG(status) OVER (PARTITION BY sys_id ORDER BY sys_id, dtime) last_status,
LEAD(status) OVER (PARTITION BY sys_id ORDER BY sys_id, dtime) next_status
FROM test_data
ORDER BY sys_id, dtime
) surrounding_status
) last_next_times
WHERE start_time IS NOT NULL OR end_time IS NOT NULL
ORDER BY sys_id, dtime
) start_end_times
) find_last_time
WHERE start_time IS NOT NULL AND status = FALSE
ORDER BY sys_id, start_time;
This is just quick code; I think there might be a simpler solution.
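For comparison, the same principle (split each system's rows into consecutive runs of equal status, keep only the F runs, subtract the first time from the last) can be sketched in a few lines of Python. This is an illustration under assumptions rather than a replacement for the query: `ROWS`, `hhmm_diff` and `outage_windows` are hypothetical names, and times are assumed to be zero-padded HH:MM strings within a single day.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question as (status, time, sys_id); True means available.
ROWS = (
    [(True, t, "S01") for t in ("10:00", "10:15", "11:15", "11:30", "12:30")]
    + [(False, t, "S01") for t in ("10:30", "10:45", "11:00", "11:45", "12:00", "12:15")]
    + [(False, t, "S02") for t in ("10:00", "10:15", "10:30", "10:45", "11:00")]
    + [(True, t, "S02") for t in ("11:15", "11:30")]
)


def hhmm_diff(start, end):
    """Return end - start as an HH:MM string (hypothetical helper)."""
    delta = datetime.strptime(end, "%H:%M") - datetime.strptime(start, "%H:%M")
    minutes = int(delta.total_seconds()) // 60
    return "%02d:%02d" % divmod(minutes, 60)


def outage_windows(rows):
    """Return (sys_id, from, to, duration) for each consecutive run of 'down' rows."""
    ordered = sorted(rows, key=lambda r: (r[2], r[1]))  # by system, then time
    windows = []
    # groupby starts a new group whenever (sys_id, status) changes between
    # neighbouring rows, which is exactly a status change point.
    for (sys_id, status), run in groupby(ordered, key=lambda r: (r[2], r[0])):
        if status:
            continue  # only unavailable runs are of interest
        times = [time for _, time, _ in run]
        windows.append((sys_id, times[0], times[-1], hhmm_diff(times[0], times[-1])))
    return windows


for window in outage_windows(ROWS):
    print(*window)
```

A run that consists of a single row yields a window whose From and To are equal and whose duration is 00:00, which matches what the query does when start_time and end_time fall on the same row.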