How to find a gap in a time ordered table where a given column does not have a certain value for a specified interval

2022-12-13 13:40

I have a large table (millions of rows) where I need to find groups of records based on the presence of a certain column value and where a specified 'timeout' has not occurred. I figure one approach would be to find where these 'timeout' gaps have occurred across the entire table.

Example table:

+----------------+------+
| time           | base |
+----------------+------+
| 1245184797.064 | a    |
| 1245184802.020 | a    |
| 1245184807.103 | b    |
| 1245184812.089 | b    |
| 1245184816.831 | b    |
| 1245184821.856 | a    |
| 1245184821.856 | a    |
| 1245184855.903 | a    |
| 1245184855.903 | b    |
| 1245184858.362 | b    |
| 1245184858.362 | b    |
| 1245184860.360 | a    |
| 1245184860.360 | a    |
| 1245184862.174 | a    |
| 1245184862.174 | b    |
| 1245185001.480 | b    |
| 1245185417.556 | a    |
| 1245185417.844 | a    |
| 1245185419.960 | b    |
| 1245185420.181 | b    |
+----------------+------+

Given this set, how would I quickly find the points in the table where base=a hasn't occurred for a given number of seconds (say 5)?

To boil it down, my objective is to find spans of records where base=a HAS occurred consistently without timing out.

I think this will help you:

SELECT * FROM (
    SELECT t1.[time],
           t1.time - (SELECT MAX(time) FROM my_table t2 WHERE t2.time < t1.time and t2.base = 'a') AS timeout
    FROM my_table t1
    WHERE t1.base = 'a') d
WHERE timeout > 5

And don't forget to create an index so that this query runs more efficiently:

CREATE INDEX idx_my_table_time_base ON my_table (time, base)
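
To see what the correlated subquery returns on the sample data from the question, here is a minimal sketch that runs it through Python's sqlite3 module. SQLite is only a stand-in for whatever database you actually use; the function name timeouts_subquery and the ROWS list are made up for this illustration, and the square brackets around time are dropped because SQLite does not need them.

```python
import sqlite3

# Sample rows copied from the question: (time, base).
ROWS = [
    (1245184797.064, "a"), (1245184802.020, "a"),
    (1245184807.103, "b"), (1245184812.089, "b"),
    (1245184816.831, "b"), (1245184821.856, "a"),
    (1245184821.856, "a"), (1245184855.903, "a"),
    (1245184855.903, "b"), (1245184858.362, "b"),
    (1245184858.362, "b"), (1245184860.360, "a"),
    (1245184860.360, "a"), (1245184862.174, "a"),
    (1245184862.174, "b"), (1245185001.480, "b"),
    (1245185417.556, "a"), (1245185417.844, "a"),
    (1245185419.960, "b"), (1245185420.181, "b"),
]


def timeouts_subquery(rows, base="a", timeout=5.0):
    """Return (time, gap) for rows of `base` that follow a gap longer than `timeout`.

    Hypothetical helper: it loads the rows into an in-memory SQLite table and
    runs the correlated-subquery approach with bound parameters.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE my_table (time REAL, base TEXT)")
    con.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    con.execute("CREATE INDEX idx_my_table_time_base ON my_table (time, base)")
    sql = """
        SELECT * FROM (
            SELECT t1.time,
                   -- gap to the latest strictly earlier row with the same base
                   t1.time - (SELECT MAX(t2.time) FROM my_table t2
                              WHERE t2.time < t1.time AND t2.base = :base) AS timeout
            FROM my_table t1
            WHERE t1.base = :base) d
        WHERE timeout > :timeout
        ORDER BY time
    """
    result = con.execute(sql, {"base": base, "timeout": timeout}).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for time, gap in timeouts_subquery(ROWS):
        print(f"{time:.3f} follows a gap of {gap:.3f} s")
```

Two things to note about the result: the very first base=a row is never reported, because its gap is NULL and NULL > 5 is not true, and rows that share a timestamp are each reported, because the subquery only looks at strictly earlier times.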

One possibility, if you are using a database that supports windowing/analytic functions, is something like this:

select * from (
    select time,
           base,
           time - lag(time) over(partition by base order by time) as interval
    from example) w
where w.interval > 5

This should be able to work from a single scan of a (base,time) index. It works on PostgreSQL 8.4 and I think it should also work on Oracle 10. On SQL Server, lag() is only available from SQL Server 2012 onwards.
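
The lag() idea can be taken one step further to get the spans the question actually asks for: flag every row that follows a gap longer than the timeout, then use a running sum of those flags as a span number. The following is only a sketch under a few assumptions: it uses Python's sqlite3 module as a stand-in database (window functions need SQLite 3.25 or newer), and the names find_spans, flagged, numbered, is_start and span_id are invented for this example.

```python
import sqlite3

# Sample rows copied from the question: (time, base).
ROWS = [
    (1245184797.064, "a"), (1245184802.020, "a"),
    (1245184807.103, "b"), (1245184812.089, "b"),
    (1245184816.831, "b"), (1245184821.856, "a"),
    (1245184821.856, "a"), (1245184855.903, "a"),
    (1245184855.903, "b"), (1245184858.362, "b"),
    (1245184858.362, "b"), (1245184860.360, "a"),
    (1245184860.360, "a"), (1245184862.174, "a"),
    (1245184862.174, "b"), (1245185001.480, "b"),
    (1245185417.556, "a"), (1245185417.844, "a"),
    (1245185419.960, "b"), (1245185420.181, "b"),
]


def find_spans(rows, base="a", timeout=5.0):
    """Return (span_start, span_end, row_count) for each unbroken run of `base`.

    Hypothetical helper; requires an SQLite build with window functions (3.25+).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE example (time REAL, base TEXT)")
    con.executemany("INSERT INTO example VALUES (?, ?)", rows)
    sql = """
        WITH flagged AS (
            -- is_start = 1 when the previous row of this base is more than
            -- :timeout seconds back; the first row has no predecessor and gets 0
            SELECT time,
                   CASE WHEN time - LAG(time) OVER (ORDER BY time) > :timeout
                        THEN 1 ELSE 0 END AS is_start
            FROM example
            WHERE base = :base
        ),
        numbered AS (
            -- running count of span starts; the default window frame includes
            -- rows with an equal timestamp, so ties land in the same span
            SELECT time,
                   SUM(is_start) OVER (ORDER BY time) AS span_id
            FROM flagged
        )
        SELECT MIN(time), MAX(time), COUNT(*)
        FROM numbered
        GROUP BY span_id
        ORDER BY span_id
    """
    result = con.execute(sql, {"base": base, "timeout": timeout}).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for start, end, count in find_spans(ROWS):
        print(f"{start:.3f} .. {end:.3f} ({count} rows)")
```

On the sample data with a 5 second timeout this should split the base=a rows into four spans. The same WITH query should carry over to PostgreSQL with little change, but I have only sketched it against SQLite here.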

One way to approach this is to check for "stretch heads", that is, occurrences of a base with more than 5 seconds since its last occurrence. This example query joins the table to itself to filter out non-heads:

select    head.* 
from      @t head
left join @t nohead 
on        head.base = nohead.base 
and       head.time - 5 < nohead.time and nohead.time < head.time
where     nohead.base is null
order by  head.[time]

For each row, the left join searches for an earlier row with the same base within the preceding 5 seconds. The where nohead.base is null clause keeps only the rows for which no such earlier row exists. The result is a list of the points where a base shows up again after being absent for 5 or more seconds.

It won't list the last gap: you could explicitly add "end time" rows for each base:

<end time>     a
<end time>     b
...

to make the query check end-gaps.
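
Here is a rough sketch of the self-join with the optional "end time" rows, again using Python's sqlite3 module as a stand-in for the real database. The function name stretch_heads, the table name t (in place of the @t table variable) and the end_time parameter are assumptions made for this example; what the end time should be is up to you.

```python
import sqlite3

# Sample rows copied from the question: (time, base).
ROWS = [
    (1245184797.064, "a"), (1245184802.020, "a"),
    (1245184807.103, "b"), (1245184812.089, "b"),
    (1245184816.831, "b"), (1245184821.856, "a"),
    (1245184821.856, "a"), (1245184855.903, "a"),
    (1245184855.903, "b"), (1245184858.362, "b"),
    (1245184858.362, "b"), (1245184860.360, "a"),
    (1245184860.360, "a"), (1245184862.174, "a"),
    (1245184862.174, "b"), (1245185001.480, "b"),
    (1245185417.556, "a"), (1245185417.844, "a"),
    (1245185419.960, "b"), (1245185420.181, "b"),
]


def stretch_heads(rows, timeout=5.0, end_time=None):
    """Return (time, base) rows that have no same-base row in the preceding `timeout` seconds.

    Hypothetical helper. If `end_time` is given, one extra row per base is
    appended at that time so that a trailing gap shows up as a head too.
    """
    rows = list(rows)
    if end_time is not None:
        bases = sorted({base for _, base in rows})
        rows += [(end_time, base) for base in bases]  # the "end time" rows
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (time REAL, base TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    sql = """
        SELECT head.time, head.base
        FROM t AS head
        LEFT JOIN t AS nohead
               ON head.base = nohead.base
              AND head.time - :timeout < nohead.time
              AND nohead.time < head.time
        WHERE nohead.base IS NULL      -- keep rows with no recent predecessor
        ORDER BY head.time, head.base
    """
    result = con.execute(sql, {"timeout": timeout}).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for time, base in stretch_heads(ROWS):
        print(f"{time:.3f} {base}")
```

Keep in mind that the first row of every base is always reported as a head, since nothing precedes it, and that rows sharing a timestamp are all reported because the join only matches strictly earlier rows.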
