How to aggregate data from one collection into another in an efficient way using MongoDB?
I have a problem that I am not sure how to solve efficiently.
I have two collections:
1)
hits = {
'day': '',
'number_of_hits': 0,
'user_id': 0
}
2)
stats = {
'day': '',
'total_number_of_hits': 0,
'user_id': 0
...
some other stuff
}
I need to get the sum of number_of_hits for each day (there can be many documents for each day, each with a different number of hits) and update the stats collection with those sums as quickly as possible. This has to be done for each user_id found in the hits collection.
For example, I could compute the aggregates for the hits collection and then update the stats collection in a loop.
But something tells me that's not a good way.
Also, the stats collection may sometimes have no documents for some days, so those need to be created instead of updated.
If you can give me any ideas it would be amazing :)
Thank you, PabloX
Maybe try redesigning your structure as
stats = {
'day': '',
'user_id': 0,
'hits': [
// Array of your hits documents
]
}
so that you get only one document containing all the hits. You can then calculate the total at any time.
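With a document shaped like that, the total is just a sum over the embedded array. A minimal Python sketch, assuming each entry in 'hits' keeps the number_of_hits field from the original hits documents (the helper name total_hits and the sample values are made up for illustration):

```python
def total_hits(stats_doc):
    """Sum number_of_hits over the hits embedded in one stats document."""
    # Assumption: each embedded hit is a dict with a 'number_of_hits' key.
    return sum(hit["number_of_hits"] for hit in stats_doc.get("hits", []))


# Hypothetical sample document; the values are invented for illustration.
example_doc = {
    "day": "2010-08-01",
    "user_id": 1,
    "hits": [
        {"number_of_hits": 5},
        {"number_of_hits": 7},
    ],
}

print(total_hits(example_doc))
```

A document with no 'hits' key simply yields 0 here, which matches the case where a day has no hits recorded yet.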
Your problem seems like a classic case for Mongo's map/reduce capabilities. See http://www.mongodb.org/display/DOCS/MapReduce for more details.
One thing to watch out for with map/reduce, though: on the version of Mongo I'm using (1.4.5), running a map/reduce acquires a lock on the database that locks out all readers and writers. I'm not sure whether that is still an issue in newer versions of Mongo.
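To show the shape of the computation, here is a plain-Python sketch of the map and reduce steps for this problem. It is not Mongo's map/reduce API (which takes JavaScript functions); it only illustrates what the two functions would do: map emits one value per hits document keyed by (user_id, day), and reduce sums the values for each key. The function names and sample data are invented for illustration.

```python
from collections import defaultdict


def map_hit(hit):
    """Map step: emit a ((user_id, day), number_of_hits) pair for one hit document."""
    return (hit["user_id"], hit["day"]), hit["number_of_hits"]


def reduce_hits(values):
    """Reduce step: collapse all values emitted for one key into a single sum."""
    return sum(values)


def map_reduce(hits):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for hit in hits:
        key, value = map_hit(hit)
        grouped[key].append(value)
    return {key: reduce_hits(values) for key, values in grouped.items()}


# Invented sample data in the shape of the hits collection.
sample_hits = [
    {"day": "2010-08-01", "number_of_hits": 2, "user_id": 1},
    {"day": "2010-08-01", "number_of_hits": 3, "user_id": 1},
    {"day": "2010-08-02", "number_of_hits": 4, "user_id": 1},
    {"day": "2010-08-01", "number_of_hits": 1, "user_id": 2},
]

totals = map_reduce(sample_hits)
for (user_id, day), total in sorted(totals.items()):
    print(user_id, day, total)
```

Each entry in totals corresponds to one stats document to create or update, keyed by user_id and day.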
Another idea would be an update using the $inc operator. Basically, if a document exists that matches user_id and date, just increase hits by one; otherwise insert one.
This is the most efficient way, unless you really need to record every hit.
I am not sure how this is done with Python, but check out the documentation on Mongo:
http://www.mongodb.org/display/DOCS/Updating#Updating-%24inc
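In Python this could look roughly like the sketch below. It assumes `stats` is a pymongo Collection (pymongo collections provide `update_one` with an `upsert` flag) and that the counter lives in total_number_of_hits as in the question; the helper name add_hits is made up. With upsert=True, a missing document for that user_id and day is created instead of updated.

```python
def add_hits(stats, user_id, day, count=1):
    """Increment the per-day total for a user, creating the document if missing."""
    # Assumption: `stats` behaves like a pymongo Collection, i.e. it has
    # update_one(filter, update, upsert=...).
    return stats.update_one(
        {"user_id": user_id, "day": day},           # match one user on one day
        {"$inc": {"total_number_of_hits": count}},  # add `count` to the counter
        upsert=True,                                # insert when nothing matches
    )
```

Calling add_hits(stats, 1, "2010-08-01") for every incoming hit keeps the stats collection current, so no separate aggregation pass is needed. Passing a larger count lets you apply a precomputed sum in one call.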