Rails 3 app with PostgreSQL - Getting the list of messages grouped by converstation
I am running a rails 3 app that models email messages. The app is deployed on on Heroku so the backend db is PostgreSQL. Messages threads are simple modeled by the thread_id field in the Posts table. When the user post a new Post, let's call it p1 then p1.thread_id=p1.id. If the user reply to p1 with p2 then p2.thread_id=p1.thread_id.
I need to write a query to select the开发者_如何学编程 messages sent to a certain user. The resulting list must contain only one message per thread and that message must be the latest message in the thread. I also need to know how many messages are in each thread.
Using the following select:
SELECT DISTINCT ON(thread_id) * FROM "posts"
does not work as it does not returns the post sorted by last first.
This one does not work either:
SELECT DISTINCT ON(thread_id) * FROM "posts" ORDER BY thread_id, posts.created_at DESC
as the posts are ordered first by thread_id.
Posts table:
create_table "posts", :force => true do |t|
t.string "title"
t.text "content"
t.string "content_type"
t.text "options"
t.string "type"
t.integer "receiver_id"
t.integer "sender_id"
t.integer "circle_id"
t.text "data_bag"
t.datetime "created_at"
t.datetime "updated_at"
t.integer "parent_id"
t.integer "thread_id"
t.datetime "posted_at"
end
Thanks for any help.
If you don't mind getting your hands dirty with a bit of SQL you can use a window function to get the job done. You can get the post IDs with this SQL:
select id
from (
select id,
rank() over (partition by thread_id order by created_at desc)
from posts
where receiver_id = #{user.id}
) as dt
where rank = 1
If you want more columns add them to both SELECT clauses. The #{user.id}
is, of course, the recipient that you're interested in.
The interesting part is the window function:
rank() over (partition by thread_id order by created_at desc)
This will partition the table into groups based on thread_id
(sort of a localized GROUP BY), order them by the timestamp (most-recent first), and then rank()
yields 1 for the first entry in each group, 2 for the second, etc.
Given a table that looks like this:
=> select * from posts;
id | receiver_id | thread_id | created_at
----+-------------+-----------+---------------------
1 | 1 | 2 | 2011-01-01 00:00:00
2 | 1 | 2 | 2011-02-01 00:00:00
3 | 1 | 2 | 2011-03-01 00:00:00
4 | 1 | 3 | 2011-01-01 00:00:00
5 | 1 | 4 | 2011-01-01 00:00:00
6 | 1 | 3 | 2011-01-01 13:00:00
7 | 2 | 11 | 2011-06-06 11:23:42
(7 rows)
The inner query gives you this:
=> select id, rank() over (partition by thread_id order by created_at desc)
from posts
where receiver_id = 1;
id | rank
----+------
3 | 1
2 | 2
1 | 3
6 | 1
4 | 2
5 | 1
(6 rows)
And then we wrap the outer query around that to peel off just the top ranking matches:
=> select id
from (
select id,
rank() over (partition by thread_id order by created_at desc)
from posts
where receiver_id = 1
) as dt
where rank = 1;
id
----
3
6
5
(3 rows)
So add the extra columns you want and wrap it all up in a Post.find_by_sql
and you're done.
精彩评论