Optimal algorithm to follow all Twitter users

What is an optimal algorithm to follow all Twitter users using the Twitter API? I have been trying to wrap my mind around this problem and cannot find an optimal iterative approach. Thanks in advance for any suggestions.


Setting aside the questions of "why would you do such a thing?" and "won't this get your IP banned?":

This shouldn't be all that different from writing a web crawler. I would start with a few root accounts and put their follows/followers into a priority queue ordered by the number of follows/followers each user has, skipping any users you've already visited. Then repeatedly take the user with the most new follows/followers off the priority queue and visit them, keeping the queue updated as you go along.
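
A minimal sketch of that crawl in Python, with some assumptions: `get_connections` and `get_connection_count` are hypothetical callables standing in for whatever API calls return a user's follows/followers and their count, and the priority is simplified to the total connection count rather than the number of not-yet-seen connections. It only computes the visiting order; it does not call Twitter.

```python
import heapq
from typing import Callable, Hashable, Iterable, List


def crawl_order(
    roots: Iterable[Hashable],
    get_connections: Callable[[Hashable], Iterable[Hashable]],
    get_connection_count: Callable[[Hashable], int],
) -> List[Hashable]:
    """Return users in the order a best-first crawl would visit them.

    Both callables are hypothetical stand-ins for real API requests.
    """
    seen = set()   # users already queued or visited
    heap = []      # entries: (-connection_count, insertion_index, user)
    counter = 0    # tie-breaker so users themselves are never compared

    def push(user: Hashable) -> None:
        nonlocal counter
        if user in seen:
            return
        seen.add(user)
        # heapq is a min-heap, so negate the count to pop the largest first.
        heapq.heappush(heap, (-get_connection_count(user), counter, user))
        counter += 1

    for root in roots:
        push(root)

    visited = []
    while heap:
        _, _, user = heapq.heappop(heap)
        visited.append(user)          # this is where you would follow `user`
        for neighbor in get_connections(user):
            push(neighbor)
    return visited


# Toy in-memory graph used in place of the real network.
graph = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a", "d", "e"],
    "d": ["c"],
    "e": ["c"],
}
print(crawl_order(["a"], lambda u: graph[u], lambda u: len(graph[u])))
# ['a', 'c', 'b', 'd', 'e']
```

Users with no path from the chosen roots are never reached, so the choice of root accounts matters.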

Again, this sounds like a terrible idea to implement in practice. Twitter had 190 million users in July 2010!


As long as you have a theoretical machine, so that time and the number of API calls don't matter, the solution is simple. Every user has a unique id. A user I am following who created his account last week has an id of 229,863,592, so let's use 250,000,000 as the theoretical end point. You can start with an id of 1 and use the API to follow each user from 1 to 250,000,000. Trying to follow anyone who has deleted their account or been suspended will return an error. The Twitter API method for following a user by id is:

http://dev.twitter.com/doc/post/friendships/create
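
The enumeration described in this answer could be sketched roughly as follows in Python. This is only an illustration: `follow` is a hypothetical callable that would wrap the friendships/create request, and `FollowError` is a made-up exception it is assumed to raise for deleted or suspended accounts. Rate limits and authentication are ignored, as in the "theoretical machine" premise.

```python
from typing import Callable, List, Tuple


class FollowError(Exception):
    """Hypothetical error raised when an id cannot be followed."""


def follow_id_range(
    start: int,
    end: int,
    follow: Callable[[int], None],
) -> Tuple[List[int], List[int]]:
    """Try to follow every id in [start, end]; return (followed, failed)."""
    followed: List[int] = []
    failed: List[int] = []
    for user_id in range(start, end + 1):
        try:
            follow(user_id)            # stand-in for the real API request
        except FollowError:
            failed.append(user_id)     # deleted, suspended, or never assigned
        else:
            followed.append(user_id)
    return followed, failed


# Fake follow function so the sketch runs without network access.
missing_ids = {3, 5}


def fake_follow(user_id: int) -> None:
    if user_id in missing_ids:
        raise FollowError(user_id)


print(follow_id_range(1, 6, fake_follow))
# ([1, 2, 4, 6], [3, 5])
```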
