Selecting every Nth row per user in Postgres
I was using this SQL statement:
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT *,
           (row_number() OVER (ORDER BY "userId", "dateId")) % 2 AS rn
    FROM user_table
) sa
WHERE sa.rn = 1
  AND "userId" = 789
  AND "Salary" > 0;
But every time the table gets new rows the result of the query is different.
Am I missing something?

I am assuming that ("dateId", "userId") is unique and that new rows always have a bigger (later) dateId.

After some comments, here is what I think you need:
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT "dateId", "userId", "Salary"
         , (row_number() OVER (PARTITION BY "userId"  -- either this
                               ORDER BY "dateId")) % 2 AS rn
    FROM user_table
    WHERE "userId" = 789                              -- ... or that
) sub
WHERE sub.rn = 1
  AND "Salary" > 0;
Notice the PARTITION BY. This way you skip every second dateId for each userId, and additional (later) rows don't change the selection so far.
Also, as long as you are selecting rows for a single userId (WHERE "userId" = 789), pull the predicate into the subquery, which achieves the same effect (a stable selection for a single user). You don't need both.
The WHERE clause in the subquery only works for a single user; PARTITION BY works for any number of users in one query.
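To make the multi-user case concrete, here is a minimal sketch that generalizes the query above from every second row to every Nth row per user. The user ids in the IN list and the step of 3 are made-up values for illustration; the table and column names are the ones from the question.

```sql
-- Sketch: keep rows 1, N+1, 2N+1, ... per user, ordered by "dateId".
-- N = 3 and the ids 789, 790 are arbitrary example values.
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT "dateId", "userId", "Salary"
         , row_number() OVER (PARTITION BY "userId"
                              ORDER BY "dateId") AS rn
    FROM user_table
    WHERE "userId" IN (789, 790)   -- several users in one query
) sub
WHERE (sub.rn - 1) % 3 = 0         -- numbering restarts at 1 for each user
  AND "Salary" > 0
ORDER BY "userId", "dateId";
```

Because the numbering restarts for each userId and later rows get higher numbers, rows already selected for a user should stay selected as long as new rows always arrive with a later dateId.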
Is that it? Is it?
They should give me "detective" badge for this.
Seriously.
No, that seems to be OK. The new rows cause the old rows to appear at different positions after sorting.
If someone inserts a new row with a userId below 789, the order will change. For example, if you have:
userId  rn
1       1
4       0
5       1
6       0
and you insert a row with userId = 2, the rn will change:
userId  rn
1       1
2       0
4       1
5       0
6       1
In order to select every Nth row you need a column with a sequence or a timestamp.
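The shift described above can be reproduced without touching the real table. This is only a sketch: the CTE named t and its inline VALUES list stand in for user_table and contain just the userId values from the example.

```sql
-- Sketch: row numbers taken over the whole set, as in the original query.
-- Add (2) to the VALUES list and rerun to see every later rn flip.
WITH t("userId") AS (
    VALUES (1), (4), (5), (6)
)
SELECT "userId"
     , (row_number() OVER (ORDER BY "userId")) % 2 AS rn
FROM t
ORDER BY "userId";
```

Numbering over a column that only grows for the rows of interest (a sequence or a timestamp) avoids this, because new rows are appended at the end instead of being sorted into the middle.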