
Google App Engine - getting count of records that match criteria over 1000

I've read in multiple places that GAE lifted the 1000-record limit on queries and counts; however, I can only seem to get a count of up to 1000 records. I won't be pulling more than 1000 records at a time, but the requirements are such that I need a count of all the matching records.

I understand you can use cursors to "paginate" through the dataset, but to cycle through just to get a count seems a bit much. Presumably when they said they "lifted" the limit, it was the hard limit - you still need to cycle through the results 1000 at a time, am I correct?

Should I be using a method other than the .all()/filter method to generate 1000+ counts?

Thanks in advance for all your help!


The behavior of Query.count() is inconsistent with the documentation when no limit is explicitly specified - the documentation indicates that it will count "until it finishes counting or times out." GAE Issue 3671 reported this bug (about 3 weeks ago).

The workaround: explicitly specify a limit and then that value will be used (rather than the default of 1,000).

Testing on http://shell.appspot.com demonstrates this:

# insert 1500 TestModel entities ...
# ...
>>> TestModel.all(keys_only=True).count()
1000L
>>> TestModel.all(keys_only=True).count(10000)
1500L

I also see the same behavior on the latest version of the development server (1.3.7) using this simple test app:

from google.appengine.ext import webapp, db
from google.appengine.ext.webapp.util import run_wsgi_app

class Blah(db.Model): pass

class MainPage(webapp.RequestHandler):
    def get(self):
        for _ in xrange(3):  # 3 batches of 500 = 1500 entities
            db.put([Blah() for _ in xrange(500)])  # can only put 500 at a time ...
        c = Blah.all().count()
        c10k = Blah.all().count(10000)
        self.response.out.write('%d %d' % (c,c10k))
        # prints "1000 1500" on its first run

application = webapp.WSGIApplication([('/', MainPage)])

def main(): run_wsgi_app(application)
if __name__ == '__main__': main()


As suggested in Issue 3671, you can set limit to None if you want to count all records (passing a number higher than 1000 also works, and is still useful to cap the count). Counting everything this way is not recommended, though; it is better to denormalize the counts and update them in a transaction.

total_records = query.count(limit=None)


According to this App Engine blog post, the 1000-entity limit has only just been removed for count (and offset) in version 1.3.6. The limit had already been removed for fetch as of version 1.3.1. Upgrade to the latest version and the limit should be removed.

You do not need to cycle through results 1000 at a time (though you could, and it might even be more efficient); simply pass in the maximum number of results you'd like back:

    for m in MyModel.all().fetch(82000):
        # ...

In versions before 1.3.1, the number passed in had to be less than or equal to 1000.
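
If you do decide to cycle through the results in batches just to get a count, a minimal sketch might look like the following. It assumes the query object exposes fetch(), cursor() and with_cursor() the way db.Query does; count_all and FakeQuery are hypothetical names, and FakeQuery is only an in-memory stand-in so the loop can be tried outside App Engine.

```python
def count_all(query, batch_size=1000):
    """Count every result of a db.Query-like object, one batch at a time."""
    total = 0
    while True:
        batch = query.fetch(batch_size)
        total += len(batch)
        if len(batch) < batch_size:
            # A short (or empty) batch means there is nothing left to read.
            return total
        # Resume the next fetch from where this batch ended.
        query.with_cursor(query.cursor())


class FakeQuery(object):
    """Hypothetical stand-in that mimics fetch/cursor/with_cursor in memory."""

    def __init__(self, items):
        self._items = list(items)
        self._start = 0  # position the next fetch starts from
        self._end = 0    # position just past the last fetched item

    def fetch(self, limit):
        batch = self._items[self._start:self._start + limit]
        self._end = self._start + len(batch)
        return batch

    def cursor(self):
        return self._end

    def with_cursor(self, start_cursor):
        self._start = start_cursor
        return self


total = count_all(FakeQuery(range(1500)), batch_size=1000)
```

On App Engine you would presumably pass something like MyModel.all(keys_only=True) instead of the stand-in, so that each batch only pulls keys rather than whole entities.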
