How do large sites (Google, Facebook, etc.) propagate information to all servers in real time?
I'm looking for some technologies to research. I'm amazed that you can go into [insert large site here]'s interface, update a setting, and within seconds it's pushed out so it's live across the board. A good example of this is AdWords. If you go into AdWords and change a campaign, those settings are stored on the server with a unique id. The ad code calls the server with that id and the information (size, colors, etc.) is pulled up instantly to show the ad. How is it that Google can push that out to hundreds of thousands of servers so quickly? What type of db systems are they using?
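As a rough sketch of what I imagine is happening on the serving side (a simple lookup keyed by that unique id; the names here are mine, not Google's actual API):

```python
# Hypothetical sketch of the ad lookup described above: the campaign settings
# live on the server under a unique id, and the ad code only sends that id back.
AD_SETTINGS = {
    "campaign-12345": {"width": 300, "height": 250, "color": "#336699"},
}

def serve_ad(campaign_id):
    # The snippet embedded on the publisher's page calls back with just the id;
    # whatever settings are currently stored are used to render the ad.
    settings = AD_SETTINGS.get(campaign_id)
    if settings is None:
        return "<!-- unknown campaign -->"
    return (f"<div style='width:{settings['width']}px;"
            f"height:{settings['height']}px;color:{settings['color']}'>ad</div>")

print(serve_ad("campaign-12345"))
```

The part I can't figure out is how that `AD_SETTINGS` data gets replicated to every serving machine within seconds of an edit.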
Google has published research papers for its Google File System (or "BigFiles" as it was once known) and BigTable, both of which are used extensively in their services. Those would probably make good reading, both in their own right and because they likely cite prior art.
You might want to read how Oracle has built RAC to propagate data across many DBs: http://download.oracle.com/docs/cd/B14117_01/server.101/b10727/ha_strea.htm
I know that Facebook uses peer-to-peer to push updates to their servers.
The first server gets the update, then it sends it to a few others, which do the same thing, and so on until the update is on all of their servers! See the sketch below.
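A minimal sketch of that kind of peer-to-peer (gossip-style) fan-out, assuming each server knows some of its peers and forwards each update it hasn't seen before. The class names and the fanout value are illustrative, not how Facebook's actual system is built:

```python
import random

class Server:
    def __init__(self, name):
        self.name = name
        self.peers = []      # other Server objects this one can reach
        self.settings = {}   # the replicated configuration data
        self.seen = set()    # update ids already applied, so forwarding stops

    def receive_update(self, update_id, key, value, fanout=3):
        if update_id in self.seen:
            return           # already have this update; don't re-forward it
        self.seen.add(update_id)
        self.settings[key] = value
        # Forward to a few random peers, which do the same thing,
        # so the update spreads through the cluster in a handful of hops.
        for peer in random.sample(self.peers, min(fanout, len(self.peers))):
            peer.receive_update(update_id, key, value, fanout)

# Wire up a small cluster and push one update into it.
servers = [Server(f"srv{i}") for i in range(10)]
for s in servers:
    s.peers = [p for p in servers if p is not s]

servers[0].receive_update("update-1", "campaign-123.color", "#336699")
reached = sum(1 for s in servers if "campaign-123.color" in s.settings)
print(f"{reached} of {len(servers)} servers have the update")
```

With a random fanout like this the update reaches all servers with high probability after a few rounds; forwarding to every known peer instead makes it deterministic at the cost of more messages.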
I have been looking into similar pieces of information.
Look for "Structured Data".
Specifics: MongoDB, CouchDB. Look for comparisons on the MongoDB website.
Facebook has made Cassandra (a distributed database) open source. I think they and many others use it now.
Also look at the Hadoop framework and MapReduce, as a matter of interest.