开发者

voting DB model, performance

for example a website where users can vote on questions (up and down) with this model:

    users(id)

    questions(id)

    votes(id,userId,questionid,vote)  
-- vote can be +1 or -1, or probably better bit 0 and 1

I'm thinking that when there are going to be lots of questions with lots of votes, there are going开发者_StackOverflow中文版 to be performance issues, especially when showing the list of questions, so

would it make sense to add a column to the questions like this:

questions(id, votessum)

and each time somebody votes besides doing an insert into the votes questions also to update the questions and set it's votessum column


I would say that no, out of the gate I would not have such a column on the question table. I would determine via actual or estiamted usage plus performance testing whether the queries required to aggregate this data would become a performance bottleneck.

If the query did create a bottleneck, I would first look at other optimizations, like improving indexes and materialized views before doing this kind of denormalization.

Denormalization is sometime useful (and sometimes necessary), but it has its own problems and pitfalls that make fully exploring other options worthwhile.


I would worry more about accuracy and capturing all data first before I worry about performance.

I would create a separate table called QuestionVotes with columns as QuestionId, UserId, VoteType, TimeStamp to capture all related activity.

This would help to track all votes, their type, who cast the vote, to prevent double votes, etc.

Now I may consider updating the count in the Questions table if the performance degrades, but I would go the route of indexing and query tuning first.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜