Improve App Engine performance by reducing entity size
The objective is to reduce the CPU cost and response time for a piece of code that runs very often and must db.get() several hundred keys each time.
Does this even work?
Can I expect the API time of a db.get() with several hundred keys to reduce roughly linearly as I reduce the size of the entity? Currently the entity has the following data attached: 9 String, 9 Boolean, 8 Integer, 1 GeoPt, 2 DateTime, 1 Text (avg size ~100 bytes FWIW), 1 Reference, 1 StringList (avg size 500 bytes). The goal is to move the vast majority of this data to related classes so that the core fetch of the main model will be quick.
If it does work, how is it implemented?
After a refactor, will I still incur the same high cost fetching existing entities? The documentation says that all properties of a model are fetched simultaneously. Will the old unneeded properties still transfer over RPC on my dime and while users wait? In other words: if I want to reduce the load time of my entities, is it necessary to migrate the old entities to ones with the new definition? If so, is it sufficient to re-put() the entity, or must I save under a wholly new key?
Example
Consider:
    from google.appengine.ext import db

    class Thing(db.Model):
        text = db.TextProperty()
        strings = db.StringListProperty()
        num = db.IntegerProperty()

    # One large Text value plus a list of ten 500-character strings
    thing = Thing(key_name='thing1', text='x' * 10240,
                  strings=['y' * 500 for i in range(10)], num=23)
    thing.put()
Let's say I re-define Thing to be streamlined and push up a new version:
    class Thing(db.Model):
        num = db.IntegerProperty()
And I fetch it again:
    thing_again = Thing.get_by_key_name('thing1')
Have I reduced the fetch time for this entity?
To answer your questions in order:
- Yes, splitting up your model will reduce the fetch time, though probably not linearly. For a relatively small model like yours, the differences may not be huge. Large list properties are the leading cause of increased fetch time.
- Old properties will still be transferred when you fetch an entity after the change to the model, because the datastore has no knowledge of models.
- Deleted properties will also remain stored even after you call .put() on the entity again. Currently, there are two ways to eliminate the old properties: replace all the existing entities with new ones, or use the lower-level google.appengine.api.datastore interface, which is dict-like and makes it easy to delete keys.
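The second approach above could be sketched roughly as follows. This assumes the legacy Python SDK, where google.appengine.api.datastore exposes Get and Put and returns dict-like Entity objects. The helper names strip_properties and strip_stored_entity are hypothetical, chosen only for illustration.

```python
def strip_properties(entity, names):
    """Delete the named properties from a dict-like entity; return those removed."""
    removed = [name for name in names if name in entity]
    for name in removed:
        del entity[name]
    return removed


def strip_stored_entity(kind, key_name, names):
    """Fetch one entity through the low-level API, drop old properties, save it."""
    # Imported lazily: these modules only exist inside the App Engine SDK/runtime.
    from google.appengine.api import datastore
    from google.appengine.api import datastore_types

    key = datastore_types.Key.from_path(kind, key_name)
    entity = datastore.Get(key)  # datastore.Entity behaves like a dict
    if strip_properties(entity, names):
        datastore.Put(entity)  # rewrite only when something was actually removed
```

For the example in the question, this would be called as strip_stored_entity('Thing', 'thing1', ['text', 'strings']). With several hundred entities you would likely want to batch the gets and puts rather than rewrite one entity per call.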
To remove properties from an entity, you can change your Model to an Expando, and then use delattr. It's documented in the App Engine docs here:
http://code.google.com/intl/fr/appengine/articles/update_schema.html
Under the heading "Removing Deleted Properties from the Datastore"
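A minimal sketch of that Expando approach, assuming Thing has been redefined as a db.Expando that declares only num, so that the old text and strings values load as dynamic properties. The helper name streamline is hypothetical, and the model class is passed in so the function itself needs no App Engine imports.

```python
# Assumed model definition inside the App Engine app (not executed here):
#
#     from google.appengine.ext import db
#
#     class Thing(db.Expando):
#         num = db.IntegerProperty()


def streamline(model_class, key_name, old_names):
    """Load one entity, delete leftover dynamic properties, and save it again."""
    instance = model_class.get_by_key_name(key_name)
    if instance is None:
        return []
    removed = [name for name in old_names if hasattr(instance, name)]
    for name in removed:
        delattr(instance, name)  # drops the dynamic property from the instance
    if removed:
        instance.put()  # the rewritten entity no longer carries those properties
    return removed
```

Inside the app this would be called as streamline(Thing, 'thing1', ['text', 'strings']). Once every entity has been rewritten, the model could be switched back from Expando to a plain Model if dynamic properties are not otherwise wanted.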
if I want to reduce the size of my entities, is it necessary to migrate the old entities to ones with the new definition?
Yes. The GAE datastore is just a big key-value store that doesn't know anything about your model definitions. So the old values will stay exactly as they were stored until you put new values in!
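To make that concrete, here is a toy illustration in which a plain dict stands in for the datastore. It is not App Engine code: store, put, get and payload_size are hypothetical names, and the size measure is only a rough proxy for what would travel over RPC.

```python
# Toy stand-in for a schemaless store: key -> dict of property values.
store = {}


def put(key, properties):
    """Overwrite whatever is stored under key with a copy of properties."""
    store[key] = dict(properties)


def get(key):
    """Return the whole stored record; the store has no notion of a model."""
    return dict(store[key])


def payload_size(properties):
    """Rough size proxy: total length of each value's string form."""
    return sum(len(str(value)) for value in properties.values())


put('thing1', {'text': 'x' * 10240, 'strings': ['y' * 500] * 10, 'num': 23})
before = payload_size(get('thing1'))

# Redefining the model class in application code changes nothing in the store,
# so a fetch still returns the same full record.
after_redefine = payload_size(get('thing1'))

# Only rewriting the record without the old properties makes it smaller.
put('thing1', {'num': 23})
after_rewrite = payload_size(get('thing1'))
```

Here before and after_redefine are equal, while after_rewrite is much smaller, which is the behaviour described above: the fetch only gets cheaper once the stored entity itself has been rewritten.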