开发者

PostgreSQL DISTINCT problem: works locally but not on server

I've come across a vexing problem with a PostgreSQL query. This works in my local development environment:

SELECT distinct (user_id) user_id, created_at, is_goodday 
FROM table 
WHERE ((created_at >= '2011-07-01 00:00:00') AND user_id = 95 
AND (created_at < '2011-08-01 00:00:00')) 
ORDER BY user_id, created_at ASC;

...but gives the following error on my QA server (which is on Heroku):

PGError: ERROR:  syntax error at or near "user_id"
LINE 1: SELECT distinct (user_id) user_id, created_at,
                                  ^ 

Why could this be?

Other possibly relevant info:

I have tried single-quoting and double-quoting the field names

It's a Rails 3 app, but I'm using 开发者_StackOverflow中文版this SQL raw, i.e. no ActiveRecord magic

My local version of Postgres is 9.0.4 on Mac, but I have no idea what version Heroku is using


As per your comment, the standard PostgreSQL version of that query would be:

SELECT user_id, created_at, is_goodday
FROM table
WHERE created_at >= '2011-07-01 00:00:00'
  AND created_at <  '2011-08-01 00:00:00'
  AND user_id     = 95
ORDER BY created_at DESC, id DESC
LIMIT 1

You don't need user_id in the ORDER BY because you have user_id = 95, you want created_at DESC in the ORDER BY to put the most recent created_at at the top; then you LIMIT 1 to slice off just the first row in the result set. GROUP BY can be used to enforce uniqueness or if you need to group things for an aggregate function but you don't need it for either one of those here as you can get uniqueness through ORDER BY and LIMIT and you can hide your aggregation inside the ORDER BY (i.e. you don't need MAX because ORDER BY does that for you).

Since you have user_id = 95 in your WHERE, you don't need user_id in the SELECT but you can leave it in if that makes it easier for you in Ruby-land.

It is possible that you could have multiple entries with the same created_at so I added an id DESC to the ORDER BY to force PostgreSQL to choose the one with the highest id. There's nothing wrong with being paranoid when they really are out to get you and bugs definitely are out to get you.

Also, you want DESC in your ORDER BY to get the highest values at the top, ASC puts the lowest values at the top. The more recent timestamps will be the higher ones.

In general, the GROUP BY and SELECT have to match up because:

When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.

But that doesn't matter here because you don't need a GROUP BY at all. I linked to the 8.3 version of the documentation to match the PostgreSQL version you're using.

There are probably various other ways to do this but this one as probably as straight forward and clear as you're going to get.


put a quote in user_id like user_id = '95'. Your query should be

SELECT distinct (user_id) as uid, created_at, is_goodday  FROM table  WHERE 
((created_at >= '2011-07-01 00:00:00') AND user_id = '95'  AND (created_at < '2011-08-01 00:00:00'))  ORDER BY user_id, created_at ASC;   


You're using DISTINCT ON (without writing the ON). Perhaps you should write the ON. Perhaps your postgres server dates from before the feature was implemented (which is pretty old by now).

If all else fails, you can always do that with some GROUP BY...

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜