Rails 3 -- get latest record with a distinct, unknown value
I have a model in the database. The model has 3 keys of interest:
created_at, type, key
- created_at is, well, the timestamp.
- type is a known enumeration.
- key follows a convention, but in general is unknown and non-unique.
There are potentially hundreds of records with a given type and key combination, and the records are potentially not small. I'm trying to avoid loading them from the database as much as possible.
The problem is to efficiently find the latest version of an object (without having to delete old versions). I want to get the latest record from the database for each key of a type, but I do not know what the keys are. Given a type, the query should return a Hash of objects [key => object], where each object is the newest record (most recent created_at value) with that key-type pair.
My first thought was to do this in memory:
# this is pseudo code, have not compiled
models = Model.where(:type => :some_type).order("created_at desc")
# keep only the first (newest) record seen per key; the block must return the hash
result = models.inject(Hash.new) {|r, m| r[m.key] = m unless r.has_key?(m.key); r }
But this is going to get ugly big as I scale. Second thought is to get all the keys then query for all the models. Something like:
keys = Model.where(:type => :some_type).select("DISTINCT key").map{|m| m.key }
result = keys.inject(Hash.new) {|r, k| r[k] = Model.where(:type => :some_type).where(:key => k).order("created_at").last; r }
But as I'm writing this code, I just keep thinking there's gotta be a better way. This solution would cause me to do one query per key, which could mean a lot of queries to the database as things grow. At some point I'll have to cap the keys anyway, so if you can suggest a solution that lets me limit/paginate the results, better still.
So, is there a way to do this more efficiently? Maybe a magic search param in Arel or keyword in SQL I've forgotten about?
I would use a separate table that records the ID of the latest record for each [type, key] pair, i.e.:
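class Model < ActiveRecord::Base
  after_create :update_latest_record

  # Point the lookup row for this [type, key] pair at the newly created record.
  def update_latest_record
    # .first is needed: where(...) alone returns a relation, which is always truthy
    if latest_model = LatestModelLookup.where(:type => self.type, :key => self.key).first
      latest_model.update_attributes(:model_id => self.id)
    else
      LatestModelLookup.create(:type => self.type, :key => self.key, :model_id => self.id)
    end
  end
end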
You'll need to have an index on LatestModelLookup(type, key) (and probably LatestModelLookup(type)).
So when you need to query the latest records by key for a type, you would have to do:
model_ids = LatestModelLookup.where(:type => type).select('model_id').map(&:model_id)
result = Model.find(model_ids).inject({}) { |res, rec| res[rec.key] = rec; res }
The benefit of having a separate table is that the overhead of updating the indices on LatestModelLookup is pretty low. The indices will only change when a record with a new, distinct [type, key] pair is added.
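The lookup-table idea above can be sketched in plain Ruby without Rails or a database, which makes the write path and the read path easy to check by hand. This is only an illustration under stated assumptions: Record, LatestLookupSketch, and the offset/limit arguments are hypothetical names, and the two hashes stand in for the models table and LatestModelLookup.

```ruby
# Plain-Ruby stand-in for the two tables; no database or Rails required.
# All names here are hypothetical and exist only for illustration.
Record = Struct.new(:id, :type, :key, :created_at)

class LatestLookupSketch
  def initialize
    @records = {}   # id => Record, stands in for the models table
    @latest  = {}   # [type, key] => id, stands in for LatestModelLookup
    @next_id = 0
  end

  # Mirrors the after_create callback: store the record, then point the
  # lookup entry for its [type, key] pair at the new id.
  def create(type, key, created_at)
    @next_id += 1
    record = Record.new(@next_id, type, key, created_at)
    @records[record.id] = record
    @latest[[type, key]] = record.id
    record
  end

  # Read path: collect the latest ids for one type, then build key => record.
  # offset and limit are an assumed way to page through the keys.
  def latest_by_key(type, offset = 0, limit = nil)
    ids = @latest.select { |(t, _key), _id| t == type }.values
    ids = ids.drop(offset)
    ids = ids.first(limit) if limit
    ids.each_with_object({}) do |id, result|
      record = @records[id]
      result[record.key] = record
    end
  end
end

store = LatestLookupSketch.new
store.create(:some_type, "a", 1)
store.create(:some_type, "b", 2)
store.create(:other_type, "a", 3)
newest_a = store.create(:some_type, "a", 4)

latest = store.latest_by_key(:some_type)
```

Here latest maps each key of :some_type to one record, and the entry for "a" is the record created last. With a real database, paging would presumably be a limit/offset on the LatestModelLookup query together with an explicit order (for example by key), since row order is not guaranteed otherwise.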
edit: conditional was reversed