Rails - Efficiently pull and calculate data across several relationships
Trying to get data in the most efficient way possible for some reports, using Rails 2.3 and MySQL.
Our app has Users, and Deals, and PurchasedDeals. Relationships look like this:
class User
  has_many :purchased_deals
  has_many :deals, :through => :purchased_deals
end
class Deal
  has_many :purchased_deals
  has_many :users, :through => :purchased_deals
end
class PurchasedDeal
  belongs_to :deal
  belongs_to :user
end
For the report I'm running, I need to get all users that have made a purchase (i.e. have at least one PurchasedDeal), and then the sum total of all the deals they have bought (price is attached to the Deal, not the PurchasedDeal).
Certainly I could start with a list of all users, including both deals and purchased deals. I've tried that, and the query is massive (30,000 users, give or take, 3,000 deals, 100,000+ purchased deals).
I could start with users, then do a .each and find the ones that have a purchased deal, split them out into their own group, and then iterate over each of those to get the total purchased amount, but that is a fair amount of queries.
Currently, both of these methods take so long that the requests are timing out. What would the most efficient way be to get the data I need? Adding columns to tables is a totally acceptable solution, btw. I have full database access to do what I need.
Thanks!
To get a list of user IDs with at least one purchase, you can do the following, which will access just one table:
user_ids = PurchasedDeal.count(:group => :user_id, :having => 'count_all > 0').keys
Subsequently, you can fetch all these users with:
users = User.find user_ids
Things can be sped up with a counter cache. In your PurchasedDeal model, add the option :counter_cache => true to the belongs_to :user association (the option belongs on the belongs_to side, not on has_many). You'll need an extra integer column on your users table, and you'll need to initialize it, which might look as follows in a migration:
add_column :users, :purchased_deals_count, :integer, :null => false, :default => 0
User.find_each { |u| User.reset_counters u.id, :purchased_deals }
Once that's out of the way, it becomes a lot simpler:
users = User.all :conditions => 'purchased_deals_count > 0'
Rails will keep the column up-to-date for you, with most standard operations.
Getting the total price will always involve a join, unless you build a hash of deal prices and do the tedious processing in Ruby. I'm no SQL expert, but you can potentially get rid of the join by storing the price with the PurchasedDeal. Otherwise, here's how to do it with a join:
user_id_to_price = PurchasedDeal.sum 'deals.price', :joins => :deal, :group => :user_id
You could filter that on just the users you want by adding something like :conditions => ['user_id IN (?)', users]. (Where users can be a list of IDs, but also User objects.)
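For the alternative mentioned above (building a hash of deal prices and summing in Ruby), here is a minimal sketch in plain Ruby. The sample rows and the names price_by_deal_id and total_by_user_id are made up for illustration. In the real app the two arrays would presumably come from narrow queries that select only the needed columns.

```ruby
# Hypothetical sample rows standing in for the deals and purchased_deals tables.
deals = [
  { :id => 1, :price => 10.0 },
  { :id => 2, :price => 25.5 }
]
purchased_deals = [
  { :user_id => 7, :deal_id => 1 },
  { :user_id => 7, :deal_id => 2 },
  { :user_id => 9, :deal_id => 2 }
]

# Build a deal_id => price lookup once, so each purchase costs one hash access.
price_by_deal_id = {}
deals.each { |deal| price_by_deal_id[deal[:id]] = deal[:price] }

# Accumulate the total spent per user. Users without purchases never appear,
# so the keys double as the list of users who have bought something.
total_by_user_id = Hash.new(0)
purchased_deals.each do |purchase|
  total_by_user_id[purchase[:user_id]] += price_by_deal_id.fetch(purchase[:deal_id])
end
```

This keeps the work at one pass over the deals and one pass over the purchased deals, at the cost of loading both row sets into memory, which should be manageable at roughly 3,000 deals and 100,000 purchases if only the ID and price columns are loaded.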
Assuming you add a price column to the purchased_deals table, you could get each user's ID and the total price of their purchased deals like this:
select users.id, sum(purchased_deals.price)
from users, purchased_deals
where users.id = purchased_deals.user_id
group by users.id
having sum(purchased_deals.price) > 0