Google App Engine Performance - Checking Existence of an Object

I am coding a system using Google App Engine and I need to put an object in the datastore only if it doesn't exist yet. I would be fine using the datastore.put() method, except I need to know whether that object already existed to count the number of new objects I have.

As far as I know I have the following options (suppose I have the key both as an attribute and as the entity key):

private Entity getEntity(String key)
{
    DatastoreService datastore =
        DatastoreServiceFactory.getDatastoreService();

    // Build a query to select this entity from the database:
    Query q = new Query("MyEntity");
    q.setKeysOnly();
    // Add a filter for the key attribute:
    q.addFilter("key", Query.FilterOperator.EQUAL, key);
    PreparedQuery pq = datastore.prepare(q);
    // Select a single entity from the database
    // (there should be no more than one matching row anyway):
    List<Entity> list = pq.asList(FetchOptions.Builder.withLimit(1));

    if (!list.isEmpty())
        // Return the found entity:
        return list.get(0);
    else
        return null;
}

or

private Entity getEntity(String keyName)
{
    DatastoreService datastore =
        DatastoreServiceFactory.getDatastoreService();

    // Build the datastore key for this entity from its key name:
    Key key = KeyFactory.createKey("MyEntity", keyName);

    try {
        return datastore.get(key);
    } catch (EntityNotFoundException e) {
        // Entity does not exist in DB:
        return null;
    }
}

I'm inclined to use the second one as it seems more straightforward, but I'm worried it might not be meant to be used that way, since it raises an exception and may incur overhead.

Which of these methods is better for checking whether an entity exists in the database?

Is there a better way to do that?


Doing a get will be faster unless your entity is large and has many properties, in which case the keys-only query is likely to be faster. If performance is likely to be a significant issue here, I would recommend benchmarking both approaches; if not, the latter approach (the get by key) is more straightforward.
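For the benchmarking suggestion, a minimal timing harness might look like the sketch below. It is plain Java and not part of the App Engine SDK; the class name LookupTimer and the method averageNanos are made up for illustration. It only measures wall-clock time around whatever lookup you pass in, so numbers from a local development server should not be assumed to match production.

```java
import java.util.function.Supplier;

// Hypothetical helper (not an App Engine API): times any lookup passed to it.
class LookupTimer {
    // Returns the average wall-clock nanoseconds per call,
    // measured after some untimed warm-up calls.
    static <T> long averageNanos(Supplier<T> lookup, int warmupCalls, int timedCalls) {
        if (timedCalls <= 0) {
            throw new IllegalArgumentException("timedCalls must be positive");
        }
        // Warm-up calls are executed but not timed.
        for (int i = 0; i < warmupCalls; i++) {
            lookup.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedCalls; i++) {
            lookup.get();
        }
        return (System.nanoTime() - start) / timedCalls;
    }
}
```

Assuming the two versions from the question were renamed to, say, getEntityByQuery and getEntityByKey (hypothetical names), each could be timed with a call such as LookupTimer.averageNanos(() -> getEntityByKey("some-key"), 10, 100) and the two averages compared.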


If uniqueness is required for an Entity, even this check won't guarantee it when multiple threads access the datastore at exactly the same time.

In this case, both threads would see nothing exists, and create new objects simultaneously. Even a transaction can't protect against this occurring since the application won't block access between the read to determine uniqueness and the write to save the Entity.

I know it doesn't sound likely, but this has definitely happened to us, such as when we've run MapReduce jobs to update/create a big batch of records (100k+) over 8 shards in batch.
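The check-then-write race described above can be reproduced in plain Java. The sketch below is only an analogy: a ConcurrentHashMap stands in for the datastore, the key "user-42" is made up, and a barrier deliberately forces the unlucky interleaving in which both threads finish the existence check before either one writes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the datastore; this is NOT App Engine code.
class CheckThenPutRace {
    // Runs two threads that each check for a key and then write it.
    // Returns how many threads believed they created a new entity.
    static int runRace() {
        Map<String, String> store = new ConcurrentHashMap<>();
        AtomicInteger createdCount = new AtomicInteger();
        // Both threads must complete the existence check before either writes.
        CyclicBarrier bothChecked = new CyclicBarrier(2);

        Runnable worker = () -> {
            boolean exists = store.containsKey("user-42"); // the read
            try {
                bothChecked.await();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            if (!exists) {
                // The write: each thread acts on its stale "does not exist" answer.
                store.put("user-42", Thread.currentThread().getName());
                createdCount.incrementAndGet();
            }
        };

        Thread a = new Thread(worker, "worker-a");
        Thread b = new Thread(worker, "worker-b");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return createdCount.get();
    }

    public static void main(String[] args) {
        System.out.println("threads that counted a new entity: " + runRace());
    }
}
```

Only one entry ends up in the map, yet both threads count it as new, so the printed count is 2. Any counter of new objects that relies on a separate existence check can overcount in the same way.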

The only way to guarantee the objects are unique is to specify the name of their Key, derived from the value that must be unique. A put with that key makes the datastore create a new entity if one doesn't exist; otherwise it overwrites the stored entity with the last-saved object.

So rather than:

Entity entity = new Entity("MyKind");

this ensures only one unique Entity per said property:

String myPropertyValue = getPropValue();
Entity entity = new Entity("MyKind", myPropertyValue);
ds.put(entity); // Ensures only one Entity per this property value