Counts of events grouped by date in Python?

This is no doubt another noobish question, but I'll ask it anyways:

I have a data set of events with exact datetime in UTC. I'd like to create a line chart showing total number of events by day (date) in the specified date range. Right now I can retrieve the total data set for the needed date range, but then I need to go through it and count up for each date.

The app runs on Google App Engine and uses Python.

What is the best way to create a new data set showing each date and its corresponding count (including dates on which there were no events) that I can then pass to a Django template?

Data set for this example looks like this:

class Event(db.Model):
    event_name = db.StringProperty()
    doe = db.DateTimeProperty()
    dlu = db.DateTimeProperty()
    user = db.UserProperty()

Ideally, I want something with date and count for that date.

Thanks and please let me know if something else is needed to answer this question!


You'll have to do the binning in-memory (i.e. after the datastore fetch).

The .date() method of a datetime instance will facilitate your binning; it chops off the time element. Then you can use a dictionary to hold the bins:

bins = {}
for event in Event.all().fetch(1000):
    bins.setdefault(event.doe.date(), []).append( event )

Then do what you wish with the bins, e.g. take the length of each list to get the count. For a direct count:

import collections

counts = collections.defaultdict(int)  # dates not seen yet start at 0
for event in Event.all().fetch(1000):
    counts[event.doe.date()] += 1
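
The question also asks for dates with no events, which the dictionaries above simply omit. A minimal sketch of one way to fill those gaps, assuming you already have the datetimes in memory (for example `event.doe` from the fetch above); the function name `daily_counts` and the sample values are made up for illustration:

```python
import datetime

def daily_counts(timestamps, start, end):
    """Return [(date, count), ...] for every day from start to end, inclusive."""
    counts = {}
    for ts in timestamps:
        day = ts.date()  # drop the time part so events bin by calendar day
        counts[day] = counts.get(day, 0) + 1

    result = []
    day = start
    while day <= end:
        # Days with no events get an explicit 0 instead of being skipped.
        result.append((day, counts.get(day, 0)))
        day += datetime.timedelta(days=1)
    return result

# Hypothetical sample data standing in for the fetched events' doe values.
sample = [
    datetime.datetime(2010, 3, 1, 9, 30),
    datetime.datetime(2010, 3, 1, 17, 5),
    datetime.datetime(2010, 3, 3, 0, 0),
]
series = daily_counts(sample, datetime.date(2010, 3, 1), datetime.date(2010, 3, 4))
```

The resulting list of (date, count) pairs is already in date order, so it should be straightforward to pass to a template and loop over for the chart.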


I can't see how that would be possible with a single query, as GQL has no support for GROUP BY or aggregation in general.


In order to minimize the amount of work you do, you'll probably want to write a task that sums up the per-day totals once, so you can reuse them. I'd suggest using the bulkupdate library to run a once-a-day task that counts events for the previous day, and creates a new model instance, with a key name based on the date, containing the count. Then, you can get all needed data points by doing a query (or better, a batch get) for the set of summary entities you need.
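
As a rough sketch of the "key name based on the date" idea, the helper below builds one key name per day in a range. The `daily-` prefix and the function name `summary_key_names` are assumptions for illustration, not anything the bulkupdate library prescribes; the resulting list is the kind of thing you could hand to a batch get such as `get_by_key_name` on whatever summary model you define.

```python
import datetime

def summary_key_names(start, end, prefix="daily-"):
    """Build one key name per calendar day from start to end, inclusive."""
    names = []
    day = start
    while day <= end:
        # isoformat() gives YYYY-MM-DD, so the names sort chronologically.
        names.append(prefix + day.isoformat())
        day += datetime.timedelta(days=1)
    return names

# Hypothetical date range spanning a month boundary.
names = summary_key_names(datetime.date(2010, 2, 27), datetime.date(2010, 3, 1))
```

Any day whose summary entity does not exist yet would come back empty from the batch get and can be treated as a count of 0.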
