Novice student question: Is creating a stackoverflow-esque backend as simple as 2 db tables?
I've long been perplexed by the speed of stackoverflow and how quickly the questions/comments load on the page. It seems like the backend db that stores all of this info would be humongous... How is it possible for a question and all of its associated answers to get loaded so quickly?
I've never worked in a large-scale db environment before (my background is small-business db like Access, some MySQL)...but I'd imagine the backend db for stackoverflow (simplified) is something like two tables linked by an indexed key, right? Something akin to:
Question Table: Question_PrimaryKey | QuestionText
Answer Table: Answer_PrimaryKey | Question_ForeignKey | AnswerText
(linked at Question_PrimaryKey & Question_ForeignKey).
Am I way off in thinking this is how a site like stackoverflow is set up? If so, how on earth are the answers to these questions fetched so quickly and put through to the browser? (It blows my mind, because when I build small intranet sites that use Access as a backend, the performance really starts to deteriorate when the db grows.)
Any input would be greatly appreciated. Thanks for your time!
Good web performance obviously depends on a streamlined and well-tuned database, but it has more to do with caching - basically storing frequently accessed data in memory, rather than having to pull it from the database on every request.
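To make the caching idea concrete, here is a minimal sketch in Python. It is not how SO actually does it; the QuestionCache class, the table layout, and the use of SQLite are all assumptions made up for illustration. The point is only that the second request for the same question never touches the database.

```python
import sqlite3


class QuestionCache:
    """Hypothetical read-through cache: check memory first, then the db."""

    def __init__(self, conn):
        self.conn = conn
        self.store = {}     # question id -> question text, held in memory
        self.db_hits = 0    # how many times we actually queried the db

    def get_question(self, question_id):
        # Serve from memory when we have already loaded this question.
        if question_id in self.store:
            return self.store[question_id]
        # Cache miss: go to the database once, then remember the result.
        self.db_hits += 1
        row = self.conn.execute(
            "SELECT question_text FROM question WHERE question_id = ?",
            (question_id,),
        ).fetchone()
        text = row[0] if row else None
        self.store[question_id] = text
        return text


def make_demo_connection():
    # In-memory SQLite database standing in for the real backend (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE question (question_id INTEGER PRIMARY KEY, question_text TEXT)"
    )
    conn.execute("INSERT INTO question VALUES (1, 'Is two tables enough?')")
    return conn
```

A real site also has to throw away cached entries when a question is edited or gets a new answer, which this sketch leaves out.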
This blog post talks about SO's architecture.
Optimizing Your Website with Jeff Atwood and Stackoverflow
It's the return of Jeff Atwood. He and the team have been making lots of great speed optimizations to Stackoverflow lately. What tools are they using? What kinds of speed improvements are they seeing, and what can you do to exploit their experience?
And, from a hardware architecture standpoint: High Scalability - Stack Overflow Architecture
In an extremely simplified way, yes, the backend will be like that, though the real schema will be more complex, involving more tables and relationships.
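The two-table layout from the question can be sketched roughly like this. It uses Python with SQLite purely so it runs anywhere (SO itself runs on SQL Server), and the table, column, and function names are made up for illustration. The index on the foreign key is what lets the db find a question's answers without scanning the whole answer table.

```python
import sqlite3


def build_demo_db():
    # In-memory SQLite database; the schema mirrors the question's two tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE question (
            question_id   INTEGER PRIMARY KEY,
            question_text TEXT NOT NULL
        );
        CREATE TABLE answer (
            answer_id   INTEGER PRIMARY KEY,
            question_id INTEGER NOT NULL REFERENCES question(question_id),
            answer_text TEXT NOT NULL
        );
        -- Index on the foreign key so lookups by question are fast.
        CREATE INDEX idx_answer_question ON answer(question_id);
        """
    )
    conn.execute("INSERT INTO question VALUES (1, 'Is two tables enough?')")
    conn.executemany(
        "INSERT INTO answer (question_id, answer_text) VALUES (?, ?)",
        [(1, "Roughly, yes."), (1, "Plus a lot of caching.")],
    )
    return conn


def answers_for(conn, question_id):
    # Fetch every answer linked to one question, oldest first.
    rows = conn.execute(
        "SELECT answer_text FROM answer WHERE question_id = ? ORDER BY answer_id",
        (question_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling answers_for(build_demo_db(), 1) returns the two demo answers in insertion order.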
StackOverflow uses SQL Server 2008 which, while similar in many respects to Access, is on a whole new level in terms of sophistication and performance. As far as databases go, they don't get much worse than Access (somebody will probably correct me on that).
The performance is very good, but that will be the result of a lot of performance tuning: carefully optimised queries, indexes, schema, partitions etc., along with a lot of caching.
This site actually provides a dump of the data. You can have a look to see how it's put together.