
How to store 2 billion users?

There is a portal with two billion registered users. If you store all 2 billion users in a conventional database, it will take a long time to retrieve the data for a particular user when that user tries to log in. How do you handle this situation so that the user gets a response quickly?


I don't see any particular reason why a conventional database on decent modern hardware couldn't retrieve log-on information pretty quickly, even if you have 2 billion records. It's just a simple indexed lookup after all (you did remember to index on user ID, right?)
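To make the "simple indexed lookup" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the column names and the index name idx_users_user_id are assumptions made up for illustration, and a small in-memory table stands in for the real 2 billion rows. The query plan shows that the log-on query is answered through the index rather than by scanning the table.

```python
import sqlite3

# In-memory database as a stand-in; schema and names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id TEXT NOT NULL, password_hash TEXT NOT NULL)"
)
# The index is what turns a log-on into a B-tree search instead of a full scan.
conn.execute("CREATE UNIQUE INDEX idx_users_user_id ON users (user_id)")
conn.executemany(
    "INSERT INTO users (user_id, password_hash) VALUES (?, ?)",
    ((f"user{i}", f"hash{i}") for i in range(100_000)),
)
conn.commit()


def find_user(conn, user_id):
    """Return (user_id, password_hash) for one user, or None if unknown."""
    return conn.execute(
        "SELECT user_id, password_hash FROM users WHERE user_id = ?",
        (user_id,),
    ).fetchone()


def query_plan(conn, user_id):
    """Return SQLite's description of how it will execute the lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT user_id, password_hash FROM users WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " ".join(row[-1] for row in rows)


print(find_user(conn, "user4242"))
print(query_plan(conn, "user4242"))
```

The exact wording of the plan differs between SQLite versions, but it should mention the index name if the index is being used.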

On a really big machine you might even fit most of it in RAM.

However, if you are really trying to engineer this for scale I'd look at something like Cassandra. This is a highly available, distributed NoSQL database, basically the same kind of architecture that Google, Facebook etc. would use.


I don't know if it's practical, but in theory you could use some sort of tree structure. If I remember my CS classes from a long time ago, lookups in a balanced tree are O(log n), so for a billion entries (roughly 2^30), you only ever need about 30 operations for a lookup. That's the beauty of CS...

As for actually implementing a tree structure for that, I have no idea.
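The logarithmic claim can be checked with a small sketch. Binary search over a sorted sequence is used here as a stand-in for walking a balanced binary tree (an assumption made for simplicity, since both halve the remaining candidates at each step). The function names are made up for this example.

```python
def binary_search(sorted_ids, target):
    """Return (index, probes); index is -1 when target is absent."""
    lo, hi = 0, len(sorted_ids) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_ids[mid] == target:
            return mid, probes
        if sorted_ids[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1, probes


def max_probes(n):
    """Worst-case number of probes for n >= 1 sorted items: floor(log2 n) + 1."""
    return n.bit_length()


# A range behaves like a sorted list of one million even IDs without using memory.
ids = range(0, 2_000_000, 2)
print(binary_search(ids, 1_234_568))
print(max_probes(len(ids)))
print(max_probes(2_000_000_000))
```

For 2 billion entries the worst case comes out at 31 probes, in line with the "about 30" estimate. Database indexes are typically B-trees with many keys per node, so the number of levels they need is smaller still.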


If you have a portal with 2 billion users, logins are only a small fraction of all the queries that will be performed.
The problem here is not the time it takes for one login, but what happens if one percent of all users (20 million people) are active at the same time.
Luckily, two billion users do not fit on one continent, so you can use distributed database servers that each serve their own part of the world, and synchronize them in the background (in case somebody travels to another continent).
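A rough sketch of the regional idea, with plain dictionaries standing in for the per-region database servers. The class name RegionalDirectory, the region labels and the copy-on-miss behaviour are all assumptions for illustration; a real deployment would replicate in the background rather than on the request path.

```python
class RegionalDirectory:
    """Toy model: one user store per region, with a cross-region fallback."""

    def __init__(self, regions):
        # Each dict stands in for the database server of one region.
        self.stores = {region: {} for region in regions}

    def register(self, home_region, user_id, record):
        self.stores[home_region][user_id] = record

    def lookup(self, request_region, user_id):
        """Return (record, region_that_answered), or (None, None) if unknown."""
        local = self.stores[request_region]
        if user_id in local:
            return local[user_id], request_region
        # Miss: the user may be travelling, so ask the other regions.
        for region, store in self.stores.items():
            if region != request_region and user_id in store:
                # Copy locally so the next request is served in-region
                # (a stand-in for background synchronization).
                local[user_id] = store[user_id]
                return store[user_id], region
        return None, None


directory = RegionalDirectory(["europe", "asia", "americas"])
directory.register("europe", "alice", {"name": "Alice"})
print(directory.lookup("europe", "alice"))  # served from the home region
print(directory.lookup("asia", "alice"))    # first request while travelling
print(directory.lookup("asia", "alice"))    # now served locally
```

The point of the design is that the common case, a user logging in from their own part of the world, touches only one regional server.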

If you have the resources (time, money, staff), you can invent your own Bigtable-style database like Google did (with 2 billion users you probably have the money and staff), but I would stick with normal relational databases to implement this.


For the scenario described, assuming it is feasible at all, even an advanced data structure won't help much on its own. In practice, a tree needs at least two link pointers for every node, and that overhead is awkward to maintain at the scale of billions of nodes.

One option is to partition the users: decentralize them according to some sort key, say the first letter of the username. The username then determines which instance serves that user. We would maintain a separate database for every letter, and these databases would communicate through some central coordinating tree, so that an incoming request is automatically routed to, and answered by, the right server.
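As a sketch of that partitioning, the function below maps a username to a shard by its first letter. Because names are not spread evenly across the alphabet, a hash-based variant is shown next to it for comparison; that variant is not part of the answer above, just a common alternative. The function names, the "other" bucket and the count of 26 shards are assumptions.

```python
import hashlib
from collections import Counter


def shard_by_first_letter(username):
    """One shard per letter a-z; anything else goes to an 'other' shard."""
    first = username[:1].lower()
    return first if "a" <= first <= "z" else "other"


def shard_by_hash(username, num_shards=26):
    """Alternative: a stable hash spreads users more evenly across shards."""
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


names = ["alice", "adam", "anna", "Bob", "sara", "sam", "steve", "zoe", "42cats"]
print(Counter(shard_by_first_letter(n) for n in names))
print(Counter(shard_by_hash(n) for n in names))
```

Either way, the router only needs the username to decide which database to ask, so no central lookup table of users is required.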

Another option is to keep the user's location as an additional parameter, so that the users of one region are mapped to a sub-network, which is in turn connected to a central coordinating gateway (a scenario similar to the Internet itself).

In my view, this is reasonably practical, and in a nutshell it resembles how Meta manages its users' information. To get a clearer picture, look at how Facebook's architecture manages the nodes in its network.
