AppEngine data strategy to handle a large index per user?
I’m building an AppEngine app in Python.
For the sake of discussion, imagine I’m building a Gmail clone. Except with a million short emails per user.
The point is, each user will have a large search index all to themselves; just like Gmail, each user has a personal “search engine” over their own content.
Now imagine that many of these messages belong to multiple users (e.g. mailing list emails or cc:ing a hundred users). Not all, but some reasonable fraction.
Without prematurely optimizing, what is my best bet to store the data and the indexes?
How about storing a list of User keys in each mail message? That's assuming that a single message won't be owned by more than a hundred or so users.
from google.appengine.ext import db

class User(db.Model):
    """Usual properties like name, etc."""

class Message(db.Model):
    # Keys of the users that have this message
    users = db.ListProperty(db.Key)
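To illustrate how the list-of-owners model above would be read back, here is a minimal in-memory sketch in plain Python. It does not touch the datastore: the `Message` class, the `messages_for` helper and the string keys are stand-ins invented for illustration. On the datastore itself, an equality filter on a list property matches entities whose list contains the value, so the lookup would be roughly `Message.all().filter('users =', user_key)`.

```python
# In-memory stand-in for the Message model; NOT the App Engine API.
class Message(object):
    def __init__(self, text, users):
        self.text = text
        self.users = list(users)  # keys of the users who own this message


def messages_for(user_key, messages):
    """Return the messages whose owner list contains user_key."""
    return [m for m in messages if user_key in m.users]


# Hypothetical sample data: plain strings play the role of db.Key values.
inbox = [
    Message("lunch?", ["alice", "bob"]),
    Message("list digest", ["alice", "bob", "carol"]),
    Message("private note", ["carol"]),
]
alice_texts = [m.text for m in messages_for("alice", inbox)]
```

A shared message is stored once and simply carries more owner keys, which is why this layout suits the "some messages belong to many users" case as long as the list stays short.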
If you want an unlimited number of user * message relationships, you could use another table:
class UserMessage(db.Model):
    user = db.ReferenceProperty(User)
    message = db.ReferenceProperty(Message)
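As a rough picture of how the join-table variant behaves, the sketch below models each `UserMessage` row as a plain tuple and answers both directions of the relationship. Everything in it (the namedtuple, `messages_of`, `users_of`, the string keys) is hypothetical and in-memory only; with the real model the first lookup would be a query along the lines of `UserMessage.all().filter('user =', user)`.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for the UserMessage join entity.
UserMessage = namedtuple("UserMessage", ["user", "message"])

# One link per (user, message) pair, so no single record grows with fan-out.
links = [
    UserMessage("alice", "msg1"),
    UserMessage("bob", "msg1"),
    UserMessage("alice", "msg2"),
]


def messages_of(user, links):
    """Message keys linked to one user."""
    return [link.message for link in links if link.user == user]


def users_of(message, links):
    """User keys linked to one message."""
    return [link.user for link in links if link.message == message]


alice_messages = messages_of("alice", links)
msg1_users = users_of("msg1", links)
```

The trade-off is one extra entity per recipient: a message cc'd to a hundred users costs a hundred link records, but nothing caps the number of owners.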
Here are a couple of good articles on modeling relationships like these on GAE:
http://code.google.com/appengine/articles/modeling.html
http://blog.notdot.net/2010/10/Modeling-relationships-in-App-Engine
class User(db.Model):
    pass

class Message(db.Model):
    text = db.StringProperty()

class MessageIndex(db.Model):  # parent is a Message
    users = db.StringListProperty()  # user keys

class UserIndex(db.Model):  # parent is a User
    messages = db.StringListProperty()  # message keys
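The models above keep the owner lists in separate index entities that are children of the thing they describe. A minimal sketch of the resulting two-step read, assuming the intent is to search the small index records first and only then load the messages by key, might look like this. It is plain in-memory Python, not datastore code; the dictionaries and the helpers `message_keys_for` and `fetch_messages` are invented for illustration.

```python
# Plain-Python stand-in for the Message / MessageIndex pair; NOT the datastore API.
messages = {"msg1": "lunch?", "msg2": "list digest"}  # message key -> text

# Each index record names its parent message and lists the owning user keys.
message_indexes = [
    {"parent": "msg1", "users": ["alice", "bob"]},
    {"parent": "msg2", "users": ["bob", "carol"]},
]


def message_keys_for(user_key):
    """Step 1: scan only the index records and return their parent keys."""
    return [ix["parent"] for ix in message_indexes if user_key in ix["users"]]


def fetch_messages(keys):
    """Step 2: load the message bodies by key, as a batch get would."""
    return [messages[k] for k in keys]


bob_texts = fetch_messages(message_keys_for("bob"))
```

The appeal of this split, as far as I understand it, is that reading a `Message` never has to load its (possibly long) owner list, and the index can be updated without rewriting the message itself.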
Take a look here or read the pdf.