PHP fetch all Twitter Followers and compare them to friends

2022-12-25 06:52 Q&A author:

I am looking for a scalable way to do the following:

User login
Fetch all Friends from Twitter
Fetch all Followers from Twitter
Display all Friends which aren't Followers

The Problem: How can this be done in a scalable way? A user can have up to 2 million friends or followers. Currently I'm storing both in an SQLite table and comparing them in a loop. When the user comes back, the table is cleared and the process starts again.

This works fine with 100 to 1,000 friends, but will be tricky with 500,000 friends. I can't cache the lists because they can change at any moment.

Does anyone know a good way to handle such a large amount of data?

I don't know what your database looks like, but this is how I would set it up.

CREATE TABLE twitter_users (
    user_id INTEGER PRIMARY KEY NOT NULL,
    screen_name VARCHAR(20) NOT NULL
);

CREATE TABLE friends (
    friend_id INTEGER PRIMARY KEY NOT NULL
);

CREATE TABLE followers (
    follower_id INTEGER PRIMARY KEY NOT NULL
);

Then you can use this SQL to get the friends who are not followers.

SELECT friend_id, screen_name
FROM friends
LEFT JOIN followers ON follower_id = friend_id
LEFT JOIN twitter_users ON user_id = friend_id
WHERE follower_id IS NULL

If the screen name is NULL it means they are not in your twitter_users table. You can look up the missing users and store them for later. Screen names can change so you might need to update the table periodically.
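As a rough illustration of the query above, here is a minimal sketch in Python using the standard-library sqlite3 module against an in-memory database. The sample rows, the `demo` helper and the added `ORDER BY` are assumptions made for the example. The schema and the join are the ones from this answer.

```python
import sqlite3

SCHEMA = """
CREATE TABLE twitter_users (
    user_id INTEGER PRIMARY KEY NOT NULL,
    screen_name VARCHAR(20) NOT NULL
);
CREATE TABLE friends (friend_id INTEGER PRIMARY KEY NOT NULL);
CREATE TABLE followers (follower_id INTEGER PRIMARY KEY NOT NULL);
"""


def friends_not_following_back(conn):
    """Return (friend_id, screen_name) for every friend missing from followers."""
    return conn.execute(
        """
        SELECT friend_id, screen_name
        FROM friends
        LEFT JOIN followers ON follower_id = friend_id
        LEFT JOIN twitter_users ON user_id = friend_id
        WHERE follower_id IS NULL
        ORDER BY friend_id  -- added only to make the output order predictable
        """
    ).fetchall()


def demo():
    """Build a tiny made-up data set and run the query on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO friends VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO followers VALUES (?)", [(2,), (4,)])
    # Only user 1 has a cached screen name; user 3 comes back as None (NULL).
    conn.executemany("INSERT INTO twitter_users VALUES (?, ?)", [(1, "alice")])
    return friends_not_following_back(conn)
```

Because both id columns are primary keys, SQLite can resolve each join with an index lookup instead of the row-by-row loop described in the question.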

Use the friends/ids and followers/ids APIs to get a list of friend and follower ids, 5,000 at a time. Use the users/lookup API to get up to 100 screen names per call. If a user has 2,000,000 friends it will take 400 API calls to get the list of ids, so you should still cache the list, at least for popular users.
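The paging arithmetic and the batching can be sketched as follows in Python. `fetch_page` is a hypothetical callable standing in for one friends/ids or followers/ids request. The sketch assumes it takes a cursor and returns a page of ids plus the next cursor, with -1 as the starting cursor and 0 meaning "no more pages". Check that convention against the API version you actually use.

```python
IDS_PER_PAGE = 5000     # page size quoted above for friends/ids and followers/ids
USERS_PER_LOOKUP = 100  # batch size quoted above for users/lookup


def calls_needed(total, per_call):
    """Number of requests needed to cover `total` items (integer ceiling)."""
    return -(-total // per_call)


def collect_ids(fetch_page):
    """Follow cursors until exhausted and return all ids in order.

    fetch_page(cursor) -> (ids, next_cursor) is a hypothetical wrapper
    around a single API request; it is not a real library function.
    """
    ids = []
    cursor = -1  # assumed "first page" cursor
    while cursor != 0:  # assumed "last page" marker
        page, cursor = fetch_page(cursor)
        ids.extend(page)
    return ids


def batches(ids, size=USERS_PER_LOOKUP):
    """Split ids into lists of at most `size` for users/lookup calls."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

Looking up screen names only for the ids you are about to display keeps the number of users/lookup batches small.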

Another thing to point out - do you need to display all friends that aren't followers at one time? If you only need to display a limited number at a time, 20 for example, then you can just calculate those 20; if they request more, then calculate more on the fly (or do it in the background as they browse your site; on each request, generate a few more).

I can't really imagine a situation where you would need to display a couple of million results in one page, even if that's the theoretical limit.
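If the ids are already stored locally, showing 20 at a time could look like the following sketch in Python with sqlite3. The `friends` and `followers` tables are an assumed minimal schema (one integer id column each), and `next_page` is a hypothetical helper. It uses keyset paging, which assumes ids are positive integers and remembers the last id shown instead of using an offset.

```python
import sqlite3


def next_page(conn, after_id=0, page_size=20):
    """Return up to page_size friend ids > after_id that are not followers."""
    rows = conn.execute(
        """
        SELECT friend_id
        FROM friends
        LEFT JOIN followers ON follower_id = friend_id
        WHERE follower_id IS NULL AND friend_id > ?
        ORDER BY friend_id
        LIMIT ?
        """,
        (after_id, page_size),
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    """Made-up data: friends 1..100, every even id follows back."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE friends (friend_id INTEGER PRIMARY KEY NOT NULL);
        CREATE TABLE followers (follower_id INTEGER PRIMARY KEY NOT NULL);
        """
    )
    conn.executemany("INSERT INTO friends VALUES (?)", [(i,) for i in range(1, 101)])
    conn.executemany("INSERT INTO followers VALUES (?)", [(i,) for i in range(2, 101, 2)])
    first = next_page(conn)
    second = next_page(conn, after_id=first[-1])  # continue after the last id shown
    return first, second
```

Each request then touches only the rows needed for one page, however large the full result would be.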

So, the approach that might work (from having a brief browse at their API documentation) would be to

grab a chunk of their friends (it appears that you get 100 per request anyway) using the statuses/friends API
for each retrieved friend
- use the friendships/show to determine the follower status between the two
- if you've got enough results (e.g. 20) then break, you're done
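The loop above can be sketched as a lazy generator in Python. `friend_chunks` and `follows_back` are hypothetical stand-ins: the first yields lists of friends as they come back from the API, and the second wraps one friendships/show style check. Neither is a real client call.

```python
from itertools import islice


def non_followers(friend_chunks, follows_back):
    """Lazily yield each friend for whom follows_back(friend) is false."""
    for chunk in friend_chunks:
        for friend in chunk:
            if not follows_back(friend):
                yield friend


def first_n_non_followers(friend_chunks, follows_back, n=20):
    """Stop as soon as n results are found; later chunks are never fetched."""
    return list(islice(non_followers(friend_chunks, follows_back), n))
```

Because both the chunks and the checks are consumed lazily, the number of requests depends on how quickly 20 non-followers turn up and not on the size of the full friend list.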

This approach does require more requests to the server than is permitted by Twitter's rate limiting policies, but then again, getting the entire friend list of a user with 2,000,000 friends at 100 friends per request will also exceed the limit well before you get them all (150 requests x 100 per request = 15,000). How do you plan to address this problem?
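To make the arithmetic concrete, here is a small Python sketch. It takes the figures from this answer (100 friends per request, 150 requests allowed) and treats the 150 as a quota per rate-limit window. The length of that window is not stated here, so the result is counted in windows and not in hours.

```python
def requests_needed(total_friends, per_request=100):
    """Requests required to page through the whole friend list."""
    return -(-total_friends // per_request)  # integer ceiling


def windows_needed(total_friends, per_request=100, requests_per_window=150):
    """Rate-limit windows required, assuming the quota resets each window."""
    return -(-requests_needed(total_friends, per_request) // requests_per_window)


def friends_per_window(per_request=100, requests_per_window=150):
    """Friends that can be fetched before the quota runs out."""
    return per_request * requests_per_window
```

With these numbers, a list of 2,000,000 friends needs 20,000 requests, which is 134 full windows at 15,000 friends per window.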

Not the only way to do this, but effective: run a cron job to download a list of Twitter users every day from a site that has a public list (or Twitter itself), then index those friends (run maybe 1,000 every day). Then access the Twitter API through PHP using cURL to retrieve a list of your friends, and match the arrays. This works well because you can improve your algorithm as you go. As noted above, the rate limiting policies will prevent you from doing anything else. Good luck! =)
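"Matching the arrays" does not need a nested loop. A minimal Python sketch, with the hypothetical helper name `friends_not_followers`, builds a hash set from the follower ids once and then checks each friend against it.

```python
def friends_not_followers(friend_ids, follower_ids):
    """Return friends absent from follower_ids, keeping the friends' order.

    Building the set costs one pass over the followers; each membership
    test is then constant time on average, so the whole comparison is
    roughly linear in the combined list sizes.
    """
    follower_set = set(follower_ids)
    return [fid for fid in friend_ids if fid not in follower_set]
```

For two lists of a few hundred thousand ids each this stays in memory comfortably, whereas comparing every friend with every follower in a loop grows quadratically.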
