Data modeling advice for a forum application on Google App Engine
I'm writing a simple forum-like application on Google App Engine and trying to avoid scalability issues. I'm new to this non-RDBMS approach, and I'd like to avoid pitfalls from the beginning.
The forum design is pretty simple: posts and replies will be the only concepts. What would be the best approach if the forum has millions of posts?
The model so far (stripped of irrelevant properties):
from google.appengine.ext import db

class Message(db.Model):
    user = db.StringProperty()  # will be a Google account user_id
    text = db.TextProperty()  # the text of the message
    reply_to = db.SelfReferenceProperty()  # None for a post, set for a reply (also allows reply-to-reply)
Splitting the model, which I think is faster because it will query fewer items when retrieving "all posts":
class Post(db.Model):
    user = db.StringProperty()  # will be a Google account user_id
    text = db.TextProperty()  # the text of the message

class Reply(db.Model):
    user = db.StringProperty()  # will be a Google account user_id
    text = db.TextProperty()  # the text of the message
    reply_to = db.ReferenceProperty(Post)
This is a many-to-one relation in an RDBMS world. Should a ListProperty be used instead? If so, how?
Edit:
Jaiku uses something like this:
class StreamEntry(DeletedMarkerModel):
    ...
    entry = models.StringProperty()  # ref - the parent of this, should it be a comment
    ...
Firstly, why don't you use user = db.UserProperty() instead of user = db.StringProperty()?
Secondly, I'm quite sure you should use whatever works and is most readable, and test the performance later, for three reasons:
- KISS (Keep it simple)
- Early optimizations are bad
- You can't improve what you can't measure
So start optimizing once you are ready to measure.
I'm not saying this because I know nothing about RDBMS, NoSQL DBMS or Google Datastore performance optimizations, but because I usually get my knowledge about them from testing, which contradicts my previous assumptions more often than I expect.
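As a rough illustration of "measure first", here is a minimal timing harness using the standard timeit module. The two query functions are hypothetical in-memory stand-ins, not datastore calls, so the numbers they produce say nothing about App Engine itself (a real datastore query uses an index rather than scanning a list). The point is only the shape of the comparison: swap in the real queries and compare.

```python
import timeit

# Hypothetical in-memory stand-ins for the two layouts. Real numbers must
# come from timing actual datastore queries, not these lists.
combined = [{"id": i, "reply_to": None if i % 10 == 0 else i - i % 10}
            for i in range(10000)]
posts_only = [m for m in combined if m["reply_to"] is None]


def all_posts_single_model():
    """Placeholder for 'all posts' against a single Message kind."""
    return [m for m in combined if m["reply_to"] is None]


def all_posts_split_model():
    """Placeholder for 'all posts' against a separate Post kind."""
    return list(posts_only)


def measure(fn, repeat=5, number=20):
    """Best wall-clock time in seconds for `number` calls of fn."""
    return min(timeit.repeat(fn, repeat=repeat, number=number))


timings = {
    "single": measure(all_posts_single_model),
    "split": measure(all_posts_split_model),
}
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; against a remote datastore you would probably also want to look at the spread, not just the best case.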
You might want to take a look at a good tutorial on creating a PHP forum from scratch. Sure, that one is about PHP, but it also gives a general overview of forum design.
Basically, don't split posts and replies or threads and posts. It will lead to some really awkward queries later on. A thread is simply a post that isn't replying to anything.
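To make the "a thread is simply a post that isn't replying to anything" idea concrete, here is a minimal sketch in plain Python. The Message dataclass is a stand-in for the single Message model from the question, not a db.Model, and the helper names threads and replies_to are made up for illustration. On the datastore the same two lookups would presumably be equality filters on reply_to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    """Plain-Python stand-in for the single Message model (not a db.Model)."""
    id: int
    user: str
    text: str
    reply_to: Optional[int] = None  # None means this message starts a thread


def threads(messages: List[Message]) -> List[Message]:
    """Top-level posts: messages that reply to nothing."""
    return [m for m in messages if m.reply_to is None]


def replies_to(messages: List[Message], parent_id: int) -> List[Message]:
    """Direct replies to one message, whether it is a post or another reply."""
    return [m for m in messages if m.reply_to == parent_id]


messages = [
    Message(1, "alice", "First thread"),
    Message(2, "bob", "Reply to the first thread", reply_to=1),
    Message(3, "carol", "Reply to bob's reply", reply_to=2),
    Message(4, "dave", "Second thread"),
]

thread_ids = [m.id for m in threads(messages)]
reply_ids = [m.id for m in replies_to(messages, 1)]
```

Both lookups use the same field and the same type, so reply-to-reply needs no extra model or special case.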