What is the most performant way to store a list of tuples in App Engine?

When storing and retrieving a datastore entity that contains a list of tuples, what is the most efficient way of storing this list?

When I have encountered this problem, the tuples have been anything from key-value pairs, to a datetime and sample result, to (x, y) coordinates.

The number of tuples is variable and ranges from 1 to a few hundred.

The entity containing these tuples needs to be referenced quickly and cheaply, and the tuple values do not need to be indexed.

I have had this problem a few times, and have solved it a number of different ways.

Method 1:

Convert the tuple values to a string and concatenate them together with some delimiter.

def PutEntity(entity, tuples):
  # Convert each value to a string first; str.join() rejects non-strings.
  entity.tuples = ['_'.join(str(value) for value in item) for item in tuples]
  entity.put()

Advantages: Results are easily readable in the Datastore Viewer, and everything is fetched in one get.

Disadvantages: Potential precision loss, the programmer has to serialize and deserialize by hand, and more bytes are required to store the data in string format.
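
To make the serialize/deserialize burden concrete, here is a minimal sketch of the round trip for Method 1. The helper names and the `types` argument are hypothetical, not part of any App Engine API, and the sketch assumes no value contains the delimiter. On older Python versions `str()` of a float keeps fewer digits than `repr()`, which is one way the precision loss can creep in.

```python
# Hypothetical helpers for illustration only; not part of App Engine.
DELIMITER = '_'

def encode_tuples(tuples):
    """Join each tuple's values into one delimited string."""
    return [DELIMITER.join(str(value) for value in item) for item in tuples]

def decode_tuples(strings, types):
    """Split each string and convert the parts back using the given types."""
    return [
        tuple(convert(part) for convert, part in zip(types, s.split(DELIMITER)))
        for s in strings
    ]

samples = [(1, 2.5), (3, 4.0)]
stored = encode_tuples(samples)                  # list of strings for the entity
restored = decode_tuples(stored, (int, float))   # caller must remember the types
```

Note that the type information lives only in the calling code, so every reader of the entity has to agree on it.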

Method 2:

Store each tuple value in a list and zip / unzip the tuple.

def PutEntity(entity, tuples):
  # Parallel lists: keys[i] and values[i] together form the i-th tuple.
  entity.keys = [item[0] for item in tuples]
  entity.values = [item[1] for item in tuples]
  entity.put()

Advantages: No loss of precision, the data is confusing but still viewable in the Datastore Viewer, types can be enforced, and everything is fetched in one get.

Disadvantages: The programmer needs to zip / unzip the tuples or carefully maintain order in the lists.
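
The zip / unzip step for Method 2 might look something like the following. This is only a sketch with hypothetical helper names; it assumes two-element tuples and adds a length check to catch lists that have drifted out of sync.

```python
def split_tuples(tuples):
    """Unzip a list of pairs into two parallel lists."""
    if not tuples:
        return [], []
    keys, values = zip(*tuples)
    return list(keys), list(values)

def join_tuples(keys, values):
    """Zip two parallel lists back into a list of pairs."""
    if len(keys) != len(values):
        raise ValueError('keys and values must have the same length')
    return list(zip(keys, values))

points = [(0, 1.5), (1, 2.25), (2, 3.0)]
xs, ys = split_tuples(points)      # would be assigned to the two list properties
rebuilt = join_tuples(xs, ys)      # done after fetching the entity
```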

Method 3:

Serialize the list of tuples in some manner (JSON, pickle, protocol buffers) and store it in a blob or text property.

Advantages: Usable with objects, including more complex objects, and less risk of a bug from mismatched tuple values.

Disadvantages: Does blob storage access require an additional fetch? The data cannot be viewed in the Datastore Viewer.
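
As a rough illustration of Method 3, the sketch below serializes the same list with the standard `json` and `pickle` modules. It assumes the values are JSON-friendly; something like a datetime would need extra handling for JSON. The variable names are illustrative, and the resulting bytes are what would go into a blob or text property.

```python
import json
import pickle

tuples = [('a', 1), ('b', 2.5)]

# JSON has no tuple type, so tuples come back as lists and must be rebuilt.
json_blob = json.dumps(tuples).encode('utf-8')
from_json = [tuple(item) for item in json.loads(json_blob.decode('utf-8'))]

# pickle preserves Python types, including tuples, without extra work.
pickle_blob = pickle.dumps(tuples, protocol=2)
from_pickle = pickle.loads(pickle_blob)
```

Pickle should only be used on data the application wrote itself, since unpickling untrusted bytes can execute arbitrary code.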

Method 4:

Store the tuples in another entity and keep a list of the keys.

Advantages: More obvious architecture. If the entity is a view, we no longer need to keep two copies of the tuple data.

Disadvantages: Two fetches are required, one for the entity and its key list and one for the tuples.

I am wondering if anyone knows which one performs the best and if there is a way I haven't thought about?

Thanks, Jim


I use Method 3. Blobstore may require an extra fetch, but db.BlobProperty does not. For objects that must come out of storage exactly as they went in, I use PickleProperty (which can be found in tipfy and some other utility libraries).

For objects where I just need the state stored, I wrote a JsonProperty that works similarly to PickleProperty (but uses simplejson, obviously).
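
The JsonProperty mentioned above is not shown, so the following is only a guess at the conversion logic such a property might wrap. The class and method names are hypothetical; in a real `db.Property` subclass the two methods would roughly correspond to the `get_value_for_datastore` and `make_value_from_datastore` hooks. The standard `json` module stands in for simplejson here.

```python
import json

class JsonTupleCodec(object):
    """Hypothetical stand-in for the conversion half of a JsonProperty."""

    def to_datastore(self, tuples):
        # Compact separators keep the stored text small.
        return json.dumps(tuples, separators=(',', ':'))

    def from_datastore(self, text):
        # An unset property is treated as an empty list (an assumption).
        if text is None:
            return []
        # JSON arrays decode to lists, so rebuild the tuples explicitly.
        return [tuple(item) for item in json.loads(text)]

codec = JsonTupleCodec()
text = codec.to_datastore([(1, 2), (3, 4)])
loaded = codec.from_datastore(text)
```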

For me, getting all the data in a single fetch and being idiot-proof are more important than CPU performance (in App Engine). According to the Google I/O talk on AppStats, a trip to the datastore is almost always going to be more expensive than a bit of local parsing.
