
Google AppEngine Sharding Question

My background is in relational databases, and I am experimenting with Google AppEngine, primarily for learning. I want to build an "election" app where each user belongs to a state (CA, NY, TX, etc.), picks a party (Republican, Democratic, etc.), and casts a vote for a particular year (2012 for now, but the app could be reused in 2016).

I want a user to be able to see their voting history and maybe change it once for the current election. Also, I am going to require that users specify their zip code and think it would be nice to run some reports by state and/or zip code.

Using a relational DB, it seems you would create some tables like this:

Users(userid, username, city, state, zip)
UserVote(userid, year, vote)

And then use SQL to run reports. With the AppEngine datastore, it seems that running aggregate reports is somewhat of a challenge.

My initial take would be to shard by User where each user can contain a list of Votes and then maybe double-save the aggregates elsewhere.

Any suggestions?

P.S. I have seen the AppEngine-MapReduce project, but am not sure if that would be overkill.


I don't remember exactly where I read this, but list properties in GAE reportedly become slow once they reach about 200 items. I would recommend against storing the votes as a list on each user, in favor of the foreign-key approach for Users and Votes.

Aggregates are a challenge because the datastore has none of the common helper functions such as MAX, SUM, and COUNT. The best approach is to store aggregates and counts in a separate entity type that you can query easily, and to update it every time a user casts a vote. In AppEngine it's easier to spend the time when you do the write so that your queries are faster later.

Here's an example of the objects in Java:

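import com.google.appengine.api.datastore.Key;
import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

// One entity per user.
@PersistenceCapable
public class User {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    ...
}

// One entity per vote; points back to its user instead of living in a list on User.
@PersistenceCapable
public class Vote {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;

    @Persistent
    private Key userKey;  // References a User
    ...
}

// Precomputed aggregates, updated whenever a vote is written.
@PersistenceCapable
public class UserStats {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;

    @Persistent
    private Key userKey;  // References a User
    ...
}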
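
To make the "pay at write time" idea concrete, here is a rough sketch in plain Java. It is not AppEngine code: the two maps are in-memory stand-ins for a Vote kind and an aggregate kind, and the names VoteTally, castVote, and count are made up for illustration. It also assumes a user's state does not change between votes.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch only: the two maps stand in for datastore entity kinds.
final class VoteTally {
    // Stand-in for Vote entities, keyed by "userId|year".
    private final Map<String, String> votes = new HashMap<>();
    // Stand-in for aggregate entities, keyed by "year|state|party".
    private final Map<String, Integer> counts = new HashMap<>();

    // Records or changes a vote and adjusts the stored counts in the same step.
    public void castVote(String userId, String state, int year, String party) {
        String previous = votes.put(userId + "|" + year, party);
        if (party.equals(previous)) {
            return; // same vote cast again, nothing to adjust
        }
        if (previous != null) {
            // The user changed their vote: take one away from the old party.
            counts.merge(year + "|" + state + "|" + previous, -1, Integer::sum);
        }
        counts.merge(year + "|" + state + "|" + party, 1, Integer::sum);
    }

    // Reads a precomputed count; no scan over individual votes is needed.
    public int count(String state, int year, String party) {
        return counts.getOrDefault(year + "|" + state + "|" + party, 0);
    }
}
```

A report by state then becomes a handful of lookups, for example count("CA", 2012, "Democratic"), rather than a pass over every vote. In a real app the vote and the aggregate would presumably need to be updated together in some safe way (a transaction or a task), which this sketch leaves out.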

Also, traditional sharding doesn't make much sense in AppEngine, since the underlying datastore is designed to handle queries on massive data sets with ease. The exception is a specific counter that changes frequently and may be updated by multiple users at the same time. This is a different type of sharding than you're used to in MySQL. Here is Google's article on sharding counters: http://code.google.com/appengine/articles/sharding_counters.html
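
The general shape of a sharded counter, as I understand it, looks roughly like the sketch below. This is plain in-memory Java, not the code from Google's article: each slot in the array stands in for one shard entity in the datastore, and the ShardedCounter name and its methods are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

// In-memory sketch only: each array slot stands in for one shard entity.
final class ShardedCounter {
    private final AtomicLongArray shards;

    public ShardedCounter(int numShards) {
        if (numShards < 1) {
            throw new IllegalArgumentException("numShards must be >= 1");
        }
        this.shards = new AtomicLongArray(numShards);
    }

    // Picks a shard at random so concurrent writers rarely hit the same one.
    public void increment() {
        int i = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(i);
    }

    // The total is the sum over all shards.
    public long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }
}
```

The trade-off is that writes spread across several shards instead of contending on one, while a read has to add up all the shards. For a per-state, per-party vote count you would probably keep one such counter per combination.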
