Time interval calculation in time series using SQL
I have a MySQL table like this
CREATE TABLE IF NOT EXISTS `vals` (
`DT` datetime NOT NULL,
`value` INT(11) NOT NULL,
PRIMARY KEY (`DT`)
);
the DT is unique date with time
data sample:
INSERT INTO `vals` (`DT`,`value`) VALUES
('2011-02-05 06:05:00', 300),
('2011-02-05 11:05:00', 250),
('2011-02-05 14:35:00', 145),
('2011-02-05 16:45:00', 100),
('2011-02-05 18:50:00', 125),
('2011-02-05 19:25:00', 100),
('2011-02-05 21:10:00', 125),
('2011-02-06 00:30:00', 150);
I need to get something like this:
start|end|value
NULL,'2011-02-05 06:05:00',300
'2011-02-05 06:05:00','2011-02-05 11:05:00',250
'2011-02-05 11:05:00','2011-02-05 14:35:00',145
'2011-02-05 14:35:00','2011-02-05 16:45:00',100
'2011-02-05 16:45:00','2011-02-05 18:50:00',125
'2011-02-05 18:50:00','2011-02-05 19:25:00',100
'2011-02-05 19:25:00','2011-02-05 21:10:00',125
'2011-02-05 21:10:00','2011-02-06 00:30:00',150
'2011-02-06 00:30:00',NULL,NULL
I tried the following query:
SELECT T1.DT AS `start`,T2.DT AS `stop`, T2.value AS value FROM (
SELECT DT FROM vals
) T1
LEFT JOIN (
SELECT DT,value FROM vals
) T2
ON T2.DT > T1.DT ORDER BY T1.DT ASC
but it returns to many rows (29 instead of 9) in result and I cold 开发者_StackOverflownot find any way to limit this using SQL. Is it Possible in MySQL?
Use a subquery
SELECT
(
select max(T1.DT)
from vals T1
where T1.DT < T2.DT
) AS `start`,
T2.DT AS `stop`,
T2.value AS value
FROM vals T2
ORDER BY T2.DT ASC
You can also use a MySQL specific solution employing variables
SELECT CAST( @dt AS DATETIME ) AS `start` , @dt := DT AS `stop` , `value`
FROM (SELECT @dt := NULL) dt, vals
ORDER BY dt ASC
But you need to do it precisely
- the ORDER by must be present otherwise the variables don't roll properly
- the variable needs to be NULLified within the query using a subquery to set it, otherwise if you run it twice in a row, the 2nd time it will not start with NULL
You can use a server-side variable to simulate it:
select @myvar as start, end, value, @myvar := end as next_rows_start
from vals
Variables are interpreted from left-right in sequence, so the two references to @myvar (start and next_rows_start) will output with two different values.
Just remember to reset @myvar to null before and/or after the query, otherwise the second and subsequent runs will have a wrong first row:
select @myvar := null
This would be easier if the table had a running ID column which corresponds to the times in DT (same order). If you don't want to change the table you can use a temp:
drop table if exists temp;
CREATE TABLE temp (
`id` INT(11) AUTO_INCREMENT,
`DT` datetime NOT NULL,
`value` INT(11) NOT NULL,
PRIMARY KEY (`id`)
);
insert into temp (DT,value) select * from vals order by DT asc;
select t1.DT as `start`, t2.DT as `end`, t2.value
from temp t2
left join temp t1 ON t2.id = t1.id + 1;
精彩评论