How do social media monitoring sites fetch such a huge number of user posts?
There are many social media monitoring sites on the market. I am very curious about how these sites fetch posts from such a huge number of users. How do they know which users' posts should be fetched?
For example, if a site requires me to log in with my Facebook account and then fetches/analyzes my posts or my friends' posts, that would be reasonable. But when I tried several social media monitoring services a few days ago, I found that they fetch massive amounts of data, covering users of all kinds.
How do the services know whose data they should fetch? If they fetch all the posts on a given social site, how do they achieve that? Don't social sites' APIs prohibit apps from fetching data in large amounts?
The application Social Radar is primarily crawler driven, similar to how the Google.com search engine works.
Google doesn't really worry about which users' content it is crawling; it just indexes what it can find. Content typically lives in ecosystems, so if you can find part of a conversation, you can often discover the rest of it by following links. That property also helps with spam filtering.
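To make the "discover the rest of the conversation" idea concrete, here is a minimal sketch of that kind of breadth-first crawl. The link graph and the `post/N` URLs are hypothetical stand-ins for real pages; a production crawler would fetch HTML over HTTP and extract links, but the discovery logic is the same.

```python
from collections import deque

# Toy "site": each post links to related posts in the same conversation.
# These URLs and links are made up purely for illustration.
LINK_GRAPH = {
    "post/1": ["post/2", "post/3"],
    "post/2": ["post/1", "post/4"],
    "post/3": [],
    "post/4": ["post/2"],
}

def crawl(seed, graph):
    """Breadth-first crawl: index every post reachable from one seed URL."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        for link in graph.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

print(sorted(crawl("post/1", LINK_GRAPH)))
```

Starting from a single seed post, the crawler discovers all four posts in the conversation, which is why a crawler never needs to know in advance which users to fetch.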
APIs are leveraged as well; the terms of use and rate limits differ by service.
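Regarding the question about API limits: providers usually don't forbid bulk access outright; they paginate results and cap the request rate, so services page through the data while throttling themselves. Below is a sketch of that pattern. `fetch_page` is a hypothetical stand-in for a real social API call that returns a page of posts plus a cursor for the next page; it is not any specific provider's API.

```python
import time

def fetch_page(cursor):
    # Hypothetical stand-in for a real paginated API endpoint.
    # Returns (posts, next_cursor); next_cursor is None on the last page.
    pages = {0: (["post a", "post b"], 1), 1: (["post c"], None)}
    return pages[cursor]

def fetch_all(delay=0.0):
    """Walk a cursor-paginated API, pausing between requests
    to stay under the provider's rate limit."""
    posts, cursor = [], 0
    while cursor is not None:
        page, cursor = fetch_page(cursor)
        posts.extend(page)
        time.sleep(delay)  # throttle; real limits vary by provider
    return posts

print(fetch_all())
```

With a real provider the delay (or a token-bucket limiter) would be tuned to the published rate limit, and the cursor would come from the API response itself.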