Google AppEngine Sharding Question
My background is in relational databases, and I am experimenting with Google AppEngine, primarily for learning. I want to build an "election" app where a user belongs to a state (CA, NY, TX, etc.), picks a party (Republican, Democratic, etc.), and casts a vote for a particular year (2012 for now, but the app could be reused in 2016).
I want a user to be able to see their voting history and maybe change it once for the current election. Also, I am going to require that users specify their zip code and think it would be nice to run some reports by state and/or zip code.
Using a relational DB, it seems you would create some tables like this:
Users(userid, username, city, state, zip)
UserVote(userid, year, vote)
And then use SQL to run reports. With the AppEngine datastore it seems that running aggregate reports is somewhat of a challenge.
My initial take would be to shard by User, where each user can contain a list of Votes, and then maybe double-save the aggregates elsewhere.
Any suggestions?
P.S. I have seen the AppEngine-MapReduce project, but am not sure if that would be overkill.
I don't remember exactly where I read this, but list properties in GAE become slow after they reach about 200 items. I would recommend against storing the votes as a list on the user, in favor of the foreign key approach for Users and Votes.
Aggregates are a challenge since there are none of the common helper functions such as MAX, SUM, COUNT and so on. The best approach would be to store aggregates and counts in a separate entity kind that you can query easily, and to update it every time a user casts a vote. It's easier in AppEngine to spend the time when you do the write so you can have faster queries later.
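To make the "pay at write time" idea concrete, here is a minimal sketch. It is not datastore code: the class name `VoteAggregates` and its methods are hypothetical, and an in-memory map stands in for the separate aggregate kind, with one entry per (year, state, party). In a real app each entry would be an entity that you load, update and save when a vote is written.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an aggregate entity kind.
// Each map entry plays the role of one stored aggregate row.
class VoteAggregates {
    private final Map<String, Long> totals = new HashMap<>();

    // Build a composite key such as "2012|CA|Democratic".
    private static String key(int year, String state, String party) {
        return year + "|" + state + "|" + party;
    }

    // Call this on the write path, right after a vote is saved.
    void recordVote(int year, String state, String party) {
        totals.merge(key(year, state, party), 1L, Long::sum);
    }

    // Call this when a user changes a vote: move one count between parties.
    void changeVote(int year, String state, String oldParty, String newParty) {
        totals.merge(key(year, state, oldParty), -1L, Long::sum);
        totals.merge(key(year, state, newParty), 1L, Long::sum);
    }

    // Reports become a single lookup instead of a scan over all votes.
    long total(int year, String state, String party) {
        return totals.getOrDefault(key(year, state, party), 0L);
    }
}
```

The same pattern would work for zip code reports by adding a second aggregate keyed on zip instead of state.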
Here's an example of the objects in Java:
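// Each class would go in its own .java file; the imports are shown once here.
import com.google.appengine.api.datastore.Key;
import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable
public class User {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    ...
}

@PersistenceCapable
public class Vote {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    @Persistent
    private Key userKey; // References a User
    ...
}

// Stored aggregates for a user, kept separate from the votes themselves
@PersistenceCapable
public class UserStats {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    @Persistent
    private Key userKey; // References a User
    ...
}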
Also, traditional sharding doesn't make much sense in AppEngine since the underlying datastore is designed to handle queries on massive data sets with ease. The exception is if you have a specific counter that can be changed frequently and has a potential for multiple users changing it at the same time. This is a different type of sharding than you're used to in MySQL. Here is Google's article on sharding counters: http://code.google.com/appengine/articles/sharding_counters.html
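As a rough illustration of the counter sharding idea, here is a minimal in-memory sketch. It is not the code from Google's article: the class name `ShardedCounter` is hypothetical, and an `AtomicLongArray` stands in for a set of shard entities. The point is only the shape of the technique: each increment goes to one randomly chosen shard, so concurrent writers rarely contend on the same one, and a read sums all shards.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical in-memory model of a sharded counter.
// Each array slot plays the role of one shard entity in the datastore.
class ShardedCounter {
    private final AtomicLongArray shards;

    ShardedCounter(int numShards) {
        if (numShards < 1) {
            throw new IllegalArgumentException("numShards must be at least 1");
        }
        this.shards = new AtomicLongArray(numShards);
    }

    // Write path: pick a shard at random and increment only that shard.
    void increment() {
        int i = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(i);
    }

    // Read path: the counter's value is the sum over all shards.
    long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }
}
```

More shards allow more concurrent writes at the cost of a slightly more expensive read, so the shard count is a tuning choice for hot counters such as a statewide vote total.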