JDO on GoogleAppEngine: How to count and group with BigTable

I need to collect some statistics on my entities in the datastore.

As an example, I need to know how many objects of a kind I have, how many objects have some properties set to particular values, etc. In a usual relational DBMS I could use

    SELECT COUNT(*) ... WHERE property=<some value>

or

    SELECT MAX(*), ... GROUP BY property

etc. But here I cannot see any of these structures.

Moreover, I cannot load all the objects into memory (e.g. using pm.getExtent(MyCall.class, false)), as I have too many entities (more than 100k).

Do you know any trick to achieve my goal?


Actually it depends on your specific requirements.

That said, a common approach is to prepare the stats data in the background.

For example, you can run a few tasks with the Task Queue service. Each task runs a query like select x where x.property == some value, and takes a cursor and a sum variable as parameters. On the first step the cursor is empty and the sum is zero. The task then iterates over the query results for up to 1000 items (the query limit) or 9 minutes (the task limit), incrementing the sum for each item. If the query is not finished, the task adds a request for the next step to the queue, passing the new cursor and sum values. A cursor is easily serializable to a string.

When you reach the final step, save the result value somewhere, for example in a stats results table.
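The steps above can be sketched in plain Java. This is only a simulation of the control flow, not App Engine code: an in-memory list stands in for the datastore, an integer offset stands in for the query cursor, a local queue stands in for the Task Queue service, and the filter is applied inside the loop rather than by the query. The names `CursorCountSketch`, `Step`, `runStep`, and `countMatching` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Plain-Java simulation of the chained-task counting pattern.
// No App Engine classes are used; every name here is illustrative.
class CursorCountSketch {
    static final int BATCH_SIZE = 1000; // stands in for the per-query limit

    // One queued step: where to resume (cursor) and the running total (sum).
    static final class Step {
        final int cursor;
        final long sum;
        Step(int cursor, long sum) { this.cursor = cursor; this.sum = sum; }
    }

    // Processes one batch. Returns the next step to enqueue,
    // or null once the whole scan is finished.
    static Step runStep(List<String> store, String wanted, Step step, long[] result) {
        int end = Math.min(step.cursor + BATCH_SIZE, store.size());
        long sum = step.sum;
        for (int i = step.cursor; i < end; i++) {
            if (wanted.equals(store.get(i))) {
                sum++;
            }
        }
        if (end < store.size()) {
            return new Step(end, sum); // not finished: hand over cursor and sum
        }
        result[0] = sum; // final step: "save" the result
        return null;
    }

    // Drives the steps through a local queue, the way tasks would re-enqueue themselves.
    static long countMatching(List<String> store, String wanted) {
        Queue<Step> queue = new ArrayDeque<>();
        queue.add(new Step(0, 0L)); // first step: empty cursor, zero sum
        long[] result = { -1 };
        while (!queue.isEmpty()) {
            Step next = runStep(store, wanted, queue.poll(), result);
            if (next != null) {
                queue.add(next);
            }
        }
        return result[0];
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            store.add(i % 5 == 0 ? "red" : "blue");
        }
        System.out.println(countMatching(store, "red")); // prints 500
    }
}
```

In a real application each step would be a separate task invocation, so the cursor and sum would travel as task parameters instead of living in a local object.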

Take a look at:

  • task queues - http://code.google.com/intl/en/appengine/docs/java/taskqueue/
  • cursor - http://code.google.com/intl/en/appengine/docs/java/datastore/queries.html#Query_Cursors

Also, this kind of stats/aggregation work really depends on your actual task, requirements, and project. There are a few ways to accomplish it, each optimal for different tasks. There is no standard way, as there is in SQL.


Support for aggregate functions is limited on GAE. This is primarily an artifact of the schema-less nature of BigTable. The alternative is to maintain the aggregate functions as separate fields yourself to access them quickly.
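As a rough illustration of maintaining the aggregates yourself, here is a plain-Java sketch in which counts are updated at write time, so that reading a count is a single lookup. The in-memory map stands in for a small stats entity you would persist alongside your data. The class and method names are hypothetical, and the sketch ignores concurrency and transactions, which a real datastore-backed version would have to handle.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch: aggregates are maintained on every write
// instead of being computed by a query at read time.
class MaintainedCountsSketch {
    // Stand-in for a persisted stats record: property value -> number of objects.
    private final Map<String, Long> countByValue = new HashMap<>();
    private long total;

    // Call when an object with this property value is first saved.
    void onInsert(String propertyValue) {
        countByValue.merge(propertyValue, 1L, Long::sum);
        total++;
    }

    // Call when an object with this property value is deleted.
    void onDelete(String propertyValue) {
        Long current = countByValue.get(propertyValue);
        if (current == null) {
            return; // nothing recorded for this value
        }
        if (current == 1L) {
            countByValue.remove(propertyValue);
        } else {
            countByValue.put(propertyValue, current - 1);
        }
        total--;
    }

    // Equivalent of SELECT COUNT(*) ... WHERE property = value.
    long count(String propertyValue) {
        return countByValue.getOrDefault(propertyValue, 0L);
    }

    // Equivalent of SELECT COUNT(*) over the whole kind.
    long total() {
        return total;
    }

    public static void main(String[] args) {
        MaintainedCountsSketch stats = new MaintainedCountsSketch();
        stats.onInsert("red");
        stats.onInsert("red");
        stats.onInsert("blue");
        stats.onDelete("red");
        System.out.println(stats.count("red") + " " + stats.total()); // prints "1 2"
    }
}
```

The map also gives per-value counts, which covers the simple GROUP BY case from the question.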

To do a count, you could do something like this:

    // The Java type returned for COUNT can differ between providers, so cast to Number.
    Query q = em.createQuery("SELECT count(p) FROM your.package.Class p");
    long count = ((Number) q.getSingleResult()).longValue();

but the result will probably be capped at 1000, since GAE limits the number of rows fetched to 1000.

Some helpful reading on how to work around these issues:

http://marceloverdijk.blogspot.com/2009/06/google-app-engine-datastore-doubts.html

Is there a way to do aggregate functions on Google App Engine?
