Query for historical data
I'm trying to write a query against a PostgreSQL database but I can't figure out how to do it. I have a historical table which stores the status of an object, the date and other data. Something like this:
id objectid date status ....
----------------------------
9 12 date1 2
8 12 date2 2
7 12 date3 2 <-- This is the date where the last status was set
6 12 date4 1
5 12 date5 1
4 12 date6 6
3 12 date7 6
2 12 date8 2
1 12 date9 2
I need to get the date on which the current status (the last one set for each object) was set, for all the objects in the system (objectid). In the example (I have only included info for object 12 to simplify), if the rows are ordered chronologically (date9 is the oldest and date1 is the most recent), the current status is 2 and I want to get date3, which is when this status was set for the last time. Notice that status 2 was set earlier, but then it changed to 6, then to 1, and then to 2 again. I want to get the first date in the current interval.
Can anyone tell me how to construct this query or the way to go?
Thanks.
UPDATE: query according to @Unreason's answer, extended so it can be joined to the table containing the objects that objectid references:
SELECT objectid, max(date)
FROM (
SELECT objectid, status, date, lag(status) OVER window1, last_value(status) OVER window1 last_status
FROM historical
WINDOW window1 AS ( PARTITION BY objectid ORDER BY date)
) x
WHERE (status <> lag OR lag IS NULL)
AND status = last_status
GROUP BY objectid
There are many ways to do this; window functions come to mind. Something like:
SELECT max(date)
FROM (
SELECT status, date, lag(status) OVER window1, last_value(status) OVER window1 last_status
FROM historical
WHERE objectid = 12
WINDOW window1 AS ( ORDER BY date)
) x
WHERE (status <> lag OR lag IS NULL)
AND status = last_status;
Notes:
- using keywords as field names (such as lag and date) should be avoided
- there are many other ways to write this query
- currently it works for one object (objectid = 12), but it could be modified to return the date of the last status for each object
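One caveat worth noting about the query above: when a window has an ORDER BY and no explicit frame, PostgreSQL's default frame ends at the current row (and its peers), so last_value(status) just returns the current row's own status and the status = last_status test is always true. The query still gives the right date, because the most recent status change is by definition the start of the current interval. If last_status is meant to be the object's latest status, an explicit frame is needed. A minimal sketch, assuming the same historical table; the alias prev_status and the window name w are my own naming:

```sql
SELECT max(date)
FROM (
    SELECT status,
           date,
           -- status of the previous row in date order (NULL for the first row)
           lag(status) OVER w AS prev_status,
           -- explicit frame so last_value sees the whole partition,
           -- not only the rows up to the current one
           last_value(status) OVER (w ROWS BETWEEN UNBOUNDED PRECEDING
                                         AND UNBOUNDED FOLLOWING) AS last_status
    FROM historical
    WHERE objectid = 12
    WINDOW w AS (ORDER BY date)
) x
WHERE (status <> prev_status OR prev_status IS NULL)  -- rows where the status changed
  AND status = last_status;                           -- changes into the current status
```

Renaming the lag column also avoids referring to a column called lag in the outer query.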
EDIT
Test data
CREATE TABLE historical (
id integer,
objectid integer,
date date,
status integer
);
INSERT INTO historical VALUES (1, 12, '2000-01-01', 2);
INSERT INTO historical VALUES (2, 12, '2001-01-01', 2);
INSERT INTO historical VALUES (3, 12, '2002-01-01', 6);
INSERT INTO historical VALUES (4, 12, '2003-01-01', 6);
INSERT INTO historical VALUES (5, 12, '2004-01-01', 1);
INSERT INTO historical VALUES (6, 12, '2005-01-01', 1);
INSERT INTO historical VALUES (7, 12, '2006-01-01', 2);
INSERT INTO historical VALUES (8, 12, '2007-01-01', 2);
INSERT INTO historical VALUES (9, 12, '2008-01-01', 2);
In future you might want to post the output of pg_dump -t table_name --inserts so it is easier to set up a test case.
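To check the per-object case against the test data, here is a sketch that adds a second object and returns one row per objectid. The rows for objectid 13 are made up purely for illustration, and the aliases prev_status and current_status_since are my own; IS DISTINCT FROM is used so that the first row of each object (where the previous status is NULL) counts as a change.

```sql
-- Hypothetical extra rows for a second object (not part of the original data)
INSERT INTO historical VALUES (10, 13, '2005-06-01', 4);
INSERT INTO historical VALUES (11, 13, '2006-06-01', 4);
INSERT INTO historical VALUES (12, 13, '2007-06-01', 5);

-- For each object, the latest row whose status differs from the previous row
-- is the start of the current status interval.
SELECT objectid, max(date) AS current_status_since
FROM (
    SELECT objectid,
           date,
           status,
           lag(status) OVER (PARTITION BY objectid ORDER BY date) AS prev_status
    FROM historical
) x
WHERE status IS DISTINCT FROM prev_status
GROUP BY objectid
ORDER BY objectid;

-- Expected result with the data above:
--  objectid | current_status_since
--  ---------+----------------------
--        12 | 2006-01-01
--        13 | 2007-06-01
```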
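select min(date)
from historical
where status = 2 -- 2 is the current status of object 12 (hard-coded)
and objectid = 12
and date >
(select max(date) -- last date on which the object had a different status
from historical
where objectid = 12 -- restrict to the same object
and status <> 2)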
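The same idea can be extended to every object without hard-coding the status or the objectid. The following is only a sketch under a few assumptions: dates are unique per object, DISTINCT ON (a PostgreSQL-specific feature) is acceptable, and the aliases h, cur, later and current_status_since are names I picked. It first finds each object's current status, then keeps the rows with that status that have no later row with a different status, and takes the earliest of them. Unlike the subquery with max(date), it also returns a row for an object whose status has never changed.

```sql
SELECT h.objectid, min(h.date) AS current_status_since
FROM historical h
JOIN (
    -- current status of each object: the row with the latest date
    SELECT DISTINCT ON (objectid) objectid, status
    FROM historical
    ORDER BY objectid, date DESC
) cur ON cur.objectid = h.objectid
     AND cur.status = h.status
WHERE NOT EXISTS (
    -- discard rows that are followed by a different status
    SELECT 1
    FROM historical later
    WHERE later.objectid = h.objectid
      AND later.date > h.date
      AND later.status <> cur.status
)
GROUP BY h.objectid;
```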