Realtime Socket.IO scaling problem - python

I'm trying to do something like the stream on Facebook, with socket.io 0.6 and tornadio.

Each user has his own comet channel/group on his wall. I'm sending a comet message to the wall of every one of my friends (even if they aren't online).

The problem is scaling: what if I have 1 million friends? It would take a long time to write to all of their walls.

Is there a more efficient way to do this using comet?


This is a difficult problem in the social space. There is a trade-off between two approaches:

  • push: When a user produces an event (e.g. a status update), you push that status update out to the stream of each of the user's friends. When a user loads his or her stream, you only have to read a record from a single place.
  • pull: When a user produces an event, you write that event to the user's own data record. When a user loads his stream, you poll the data record of each of his friends, aggregating the results on the fly.

The push method is good when loading a stream happens much more often than user updates and when the "fanout" of users (e.g. the maximum number of followers a user has) is low. The pull method is good when a user loading his stream is rare, or if the number of users a user can follow is low.
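To make the trade-off concrete, here is a minimal in-memory sketch of both approaches. The class and method names (`PushFeed`, `PullFeed`, `follow`, `post`, `load`) are made up for illustration, and a real system would use a datastore rather than Python dicts.

```python
from collections import defaultdict


class PushFeed:
    """Fan-out on write: each update is copied into every follower's stream."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> users following the author
        self.streams = defaultdict(list)   # user -> materialized stream

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, event):
        # Write cost grows with the author's number of followers.
        for user in self.followers[author]:
            self.streams[user].append((author, event))

    def load(self, user):
        # Read cost is a single lookup.
        return list(self.streams[user])


class PullFeed:
    """Fan-out on read: each update is stored once, streams are merged on load."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> authors the user follows
        self.records = defaultdict(list)   # author -> that author's own events
        self.clock = 0                     # logical timestamp used for ordering

    def follow(self, follower, author):
        self.following[follower].add(author)

    def post(self, author, event):
        # Write cost is constant: one append to the author's record.
        self.clock += 1
        self.records[author].append((self.clock, author, event))

    def load(self, user):
        # Read cost grows with the number of authors the user follows.
        merged = []
        for author in self.following[user]:
            merged.extend(self.records[author])
        merged.sort()
        return [(author, event) for _, author, event in merged]
```

With 1 million followers, `PushFeed.post` does 1 million appends per update, while `PullFeed.post` does one; the cost moves to `load` instead.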

I co-authored a paper on how to do this efficiently. Basically, we used a hybrid method, determining when to push or pull based on user statistics.
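As a rough illustration of what a hybrid decision could look like (this is a simplified heuristic assumed for the example, not the exact rule from the paper), you can decide per author/follower pair by comparing how often the follower reads with how often the author posts. The function names and the rates are hypothetical.

```python
def choose_strategy(posts_per_day, reads_per_day):
    """Pick 'push' or 'pull' for one (author, follower) pair.

    Assumed heuristic: pushing costs one write per post, pulling costs one
    read per stream load, so push only if the follower reads at least as
    often as the author posts.
    """
    return "push" if reads_per_day >= posts_per_day else "pull"


def plan_fanout(author_posts_per_day, follower_reads_per_day):
    """Split an author's followers into a push list and a pull list.

    follower_reads_per_day maps follower name -> average stream loads per day.
    """
    push_to, pull_from = [], []
    for follower, reads in sorted(follower_reads_per_day.items()):
        if choose_strategy(author_posts_per_day, reads) == "push":
            push_to.append(follower)
        else:
            pull_from.append(follower)
    return push_to, pull_from
```

For example, `plan_fanout(5.0, {"active": 20.0, "dormant": 0.1})` pushes to the active follower and leaves the dormant one to pull later, so friends who are rarely online do not cost a write on every update.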

For simplicity, I would recommend you implement the pull model. Cache the results of the aggregation and only refresh a user's feed after the cache entry is stale for a certain period of time.
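A minimal sketch of that cached pull model, assuming an in-memory store and a time-to-live of 30 seconds (both placeholders, as are the class and method names). The clock is injectable only so the behavior is easy to test.

```python
import time


class CachedPullFeed:
    """Pull model with a per-user cache that is rebuilt once it goes stale."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.following = {}  # user -> set of authors the user follows
        self.records = {}    # author -> list of (timestamp, event)
        self.cache = {}      # user -> (time the feed was built, feed)

    def follow(self, user, author):
        self.following.setdefault(user, set()).add(author)

    def post(self, author, event):
        # One write per update, no matter how many followers the author has.
        self.records.setdefault(author, []).append((self.clock(), event))

    def load(self, user):
        now = self.clock()
        cached = self.cache.get(user)
        if cached is not None and now - cached[0] < self.ttl:
            return cached[1]  # still fresh: skip the aggregation
        # Stale or missing: poll every followed author and merge.
        items = []
        for author in self.following.get(user, ()):
            for timestamp, event in self.records.get(author, []):
                items.append((timestamp, author, event))
        items.sort(key=lambda item: item[0], reverse=True)  # newest first
        feed = [(author, event) for _, author, event in items]
        self.cache[user] = (now, feed)
        return feed
```

The trade-off is that a user can see a feed up to `ttl_seconds` old, so the comet connection delivers updates in near real time rather than instantly. Users who are offline never trigger any work.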
