How do I get Rails to eager load counts?
This is related to a question a year and change ago.
I put up an example of the question that should work out of the box, provided you have sqlite3 available: https://github.com/cairo140/rails-eager-loading-counts-demo
Installation instructions (for the main branch)
git clone git://github.com/cairo140/rails-eager-loading-counts-demo.git
cd rails-eager-loading-counts-demo
rails s
I have a fuller write-up in the repository, but my general question is this.
How can I make Rails eager load counts in a way that minimizes db queries across the board?
The n+1
problem emerges whenever you use #count
on an association, despite having included that association via #includes(:associated)
in the ActiveRelation. A workaround is to use #length
, but this works well only when the object it's being called on has already been loaded up, not to mention that I suspect it duplicates something that the Rails internals have done already. Also, an issue with using #length
is that it results in an unfortunate over-loading when the association was not loaded to begin with and the count is all you need.
From the readme:
We can dodge this issue by running #length on the posts array (see appendix), which is already loaded, but it would be nice to have count readily available as well. Not only is it more consistent; it provides a path of access that doesn't necessarily require posts to be loaded. For instance, if you have a partial that displays the count no matter what, but half the time,开发者_如何学Go the partial is called with posts loaded and half the time without, you are faced with the following scenario:
- Using
#count
- n
COUNT
style queries when posts are already loaded- n
COUNT
style queries when posts are not already loaded- Using
#length
- Zero additional queries when posts are already loaded
- n
*
style queries when posts are not already loadedBetween these two choices, there is no dominant option. But it would be nice to revise #count to defer to #length or access the length that is some other way stored behind the scenes so that we can have the following scenario:
- Using revised
#count
- Zero additional queries when posts are already loaded
- n
COUNT
style queries when posts are not already loaded
So what's the correct approach here? Is there something I've overlooked (very, very likely)?
As @apneadiving suggested, counter_cache works well because the counter column gets automatically updated when records are added or removed. So when you load the parent object, the count is included in the object without needing to access the other table.
However, if for whatever reason you don't like that approach, you could do this:
Post.find(:all,
:select => "posts.*, count(comments.id) `comments_count`",
:joins => "left join comments on comments.post_id = posts.id")
An alternative approach to the one of Zubin:
Post.select('posts.*, count(comments.id) `comments_count`').joins(:comments).group('posts.id')
It appears that the best way to implement this sort of facility might be to create SQL Views (ref: here and here) for the seperate model-and-child-count objects that you want; and their associated ActiveRecord models.
You might be able to be very clever and use subclassing on the original model combined with set_table_name :sql_view_name
to retain all the original methods on the objects, and maybe even some of their associations.
For instance, say we were to add 'Post.has_many :comments' to your example, like in @Zubin's answer above; then one might be able to do:
class CreatePostsWithCommentsCountsView < ActiveRecord::Migration
def self.up
#Create SQL View called posts_with_comments_counts which maps over
# select posts.*, count(comments.id) as comments_count from posts
# left outer join comments on comments.post_id = posts.id
# group by posts.id
# (As zubin pointed out above.)
#*Except* this is in SQL so perhaps we'll be able to do further
# reducing queries against it *as though it were any other table.*
end
end
class PostWithCommentsCount < Post #Here there be cleverness.
#The class definition sets up PWCC
# with all the regular methods of
# Post (pointing to the posts table
# due to Rails' STI facility.)
set_table_name :posts_with_comment_counts #But then we point it to the
# SQL view instead.
#If you don't really care about
# the methods of Post being in PWCC
# then you could just make it a
# normal subclass of AR::Base.
end
PostWithCommentsCount.all(:include => :user) #Obviously, this sort of "upward
# looking" include is best used in big lists like "latest posts" rather than
# "These posts for this user." But hopefully it illustrates the improved
# activerecordiness of this style of solution.
PostWithCommentsCount.all(:include => :comments) #And I'm pretty sure you
# should be able to do this without issue as well. And it _should_ only be
# the two queries.
I have set up a small gem that adds an includes_count
method to ActiveRecord, that uses a SELECT COUNT to fetch the number of records in an association, without resorting to a JOIN which might be expensive (depending on the case).
See https://github.com/manastech/includes-count
Hope it helps!
精彩评论