
Query for historical data

I'm trying to write a query against a PostgreSQL database but I can't figure out how to do it. I have a historical table which stores the status of an object, the date, and other data. Something like this:

id objectid    date    status    ....
----------------------------
9  12          date1   2
8  12          date2   2
7  12          date3   2    <-- This is the date where the last status was set
6  12          date4   1
5  12          date5   1
4  12          date6   6
3  12          date7   6
2  12          date8   2
1  12          date9   2

I need to get the date on which the current status (the last one set for each object) was set, for every object in the system (objectid). In the example (I have only included rows for object 12 to simplify), the rows are ordered chronologically (date9 is the oldest and date1 is the most recent), the current status is 2, and I want to get date3, which is when this status was set for the last time. Notice that status 2 was also set earlier, but then it changed to 6, then to 1, and then back to 2. I want the first date in the current interval.

Can anyone tell me how to construct this query or the way to go?

Thanks.


UPDATE: query according to @Unreason's answer, modified so that it can be joined to the table containing the objects that objectid references

SELECT objectid, max(date)
FROM (
    SELECT objectid, status, date, lag(status) OVER window1, last_value(status) OVER window1 last_status
    FROM historical
    WINDOW window1 AS ( PARTITION BY objectid ORDER BY date) 
    ) x 
WHERE (status <> lag OR lag IS NULL)
AND status = last_status
GROUP BY objectid
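
For the join mentioned above, one possible shape is to put the per-object result in a CTE and join it to the object table. This is only a sketch: the table objects and its columns id and name are hypothetical (the post does not show that table), and prev_status and changed_on are aliases chosen here. DISTINCT ON is PostgreSQL-specific and is used to also return the current status alongside its start date.

```sql
-- Sketch only: "objects", "id" and "name" are assumed names, not from the post.
WITH changes AS (
    SELECT objectid,
           status,
           date AS changed_on,
           -- status of the previous row for the same object, NULL on the first row
           lag(status) OVER (PARTITION BY objectid ORDER BY date) AS prev_status
    FROM historical
),
current_status AS (
    -- keep only rows where the status changed, then the latest change per object
    SELECT DISTINCT ON (objectid) objectid, status, changed_on
    FROM changes
    WHERE prev_status IS NULL OR status <> prev_status
    ORDER BY objectid, changed_on DESC
)
SELECT o.id, o.name, c.status, c.changed_on
FROM objects o
LEFT JOIN current_status c ON c.objectid = o.id;  -- objects without history get NULLs
```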


There are many ways to do this; window functions come to mind.

Something like

SELECT max(date)
FROM (
      SELECT status, date, lag(status) OVER window1, last_value(status) OVER window1 last_status
      FROM historical
      WHERE objectid = 12
      WINDOW window1 AS ( ORDER BY date) 
      ) x 
WHERE (status <> lag OR lag IS NULL)
      AND status = last_status;

Notes:

  • using keywords as field names (such as lag and date) should be avoided
  • there are many other ways to write this query
  • currently it works for a single object (objectid = 12), but it could be modified to return the date of the last status change for each object
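
One caveat that is worth checking against the PostgreSQL documentation on window frames: when a window has ORDER BY but no frame clause, the default frame ends at the current row (or its last peer), so last_value(status) returns the current row's own status and the status = last_status filter lets every row through. The query still gives the right date, because the most recent status change is by definition the start of the current interval. A sketch of the same idea without that filter, for every object at once, using explicit aliases (prev_status and current_status_since are names chosen here) instead of the implicit column name lag:

```sql
SELECT objectid, max(date) AS current_status_since
FROM (
    SELECT objectid,
           status,
           date,
           -- previous status of the same object; NULL for its first row
           lag(status) OVER (PARTITION BY objectid ORDER BY date) AS prev_status
    FROM historical
) x
-- keep only the rows where the status actually changed
WHERE prev_status IS NULL OR status <> prev_status
GROUP BY objectid;
```

With the test data posted in this answer, the only change rows for object 12 are those dated 2000-01-01, 2002-01-01, 2004-01-01 and 2006-01-01, so the query should return 2006-01-01.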

EDIT
Test data

CREATE TABLE historical (
    id integer,
    objectid integer,
    date date,
    status integer
);

INSERT INTO historical VALUES (1, 12, '2000-01-01', 2);
INSERT INTO historical VALUES (2, 12, '2001-01-01', 2);
INSERT INTO historical VALUES (3, 12, '2002-01-01', 6);
INSERT INTO historical VALUES (4, 12, '2003-01-01', 6);
INSERT INTO historical VALUES (5, 12, '2004-01-01', 1);
INSERT INTO historical VALUES (6, 12, '2005-01-01', 1);
INSERT INTO historical VALUES (7, 12, '2006-01-01', 2);
INSERT INTO historical VALUES (8, 12, '2007-01-01', 2);
INSERT INTO historical VALUES (9, 12, '2008-01-01', 2);

In the future, you might want to post the output of

pg_dump -t table_name --inserts

so that it is easier to set up a test case.


select  min(date)
from    historical
where   status        =   2 
and     objectid      =   12
and     date > 
  (select max(date) 
    from  historical
    where status <> 2
    and   objectid = 12)
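
The query above hard-codes both the object and its current status, and it returns NULL when the object has never had any other status, because the subquery then yields NULL. A possible way around both, still without window functions, is to keep only the rows that have no later row with a different status for the same object, and then take the earliest of those. This is a sketch; the aliases h, later and current_status_since are illustrative.

```sql
SELECT h.objectid, min(h.date) AS current_status_since
FROM historical h
WHERE NOT EXISTS (
    -- a later row for the same object with a different status
    -- means h is not part of the current interval
    SELECT 1
    FROM historical later
    WHERE later.objectid = h.objectid
      AND later.date > h.date
      AND later.status <> h.status
)
GROUP BY h.objectid;
```

On a large table the correlated subquery may be slow unless there is an index covering (objectid, date), so the window-function approach is probably the better default.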