Defining Trending Topics in a specific collection of tweets
I'm building a Java application that has to determine the Trending Topics in a specific collection of tweets obtained through Twitter Search. While searching the web, I found that a topic is considered trending when it gets a large number of mentions within a short, recent window of time, that is, right now. So there must be some decay calculation so that the topics change often. However, I have another question:
How does Twitter determine which specific terms in a tweet should become a TT? For example, I've observed that most TTs are hashtags or proper nouns. Does this make any sense? Or do they analyse all words and determine the frequency?
I hope someone can help me! Thanks!
I don't think anyone knows except Twitter. It seems hashtags play a big part, but there are other factors in play. I think mining the whole text would take more time than needed and would result in too many false positives.
Here is an interesting article from Mashable:
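To illustrate the idea of looking only at hashtags and proper nouns rather than every word, here is a minimal Java sketch. It is not Twitter's method (which is not public). The class name `CandidateTerms` and the regular expression are assumptions made for this example, and "capitalized word of two or more letters" is only a crude stand-in for a proper noun, so sentence-initial words will slip through as false positives.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts candidate terms across a collection of tweets.
class CandidateTerms {
    // Either a hashtag (#word) or a capitalized word of at least two letters.
    // The capitalized-word rule is an assumed, rough proxy for proper nouns.
    private static final Pattern CANDIDATE =
            Pattern.compile("#\\w+|\\b\\p{Lu}\\p{L}+\\b");

    // Returns a map from lower-cased candidate term to its number of occurrences.
    static Map<String, Integer> count(List<String> tweets) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tweet : tweets) {
            Matcher m = CANDIDATE.matcher(tweet);
            while (m.find()) {
                String term = m.group().toLowerCase(Locale.ROOT);
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Calling `CandidateTerms.count(tweets)` on the search results gives raw frequencies for the candidates only. A stop-word list or a real part-of-speech tagger would likely be needed to cut down the false positives.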
http://www.sparkmediasolutions.com/pdfs/SMS_Twitter_Trending.pdf
-Ralph Winters
You may be interested in meme tracking, which, as I recall, does interesting things with proper nouns. Basically, it identifies topics in a stream as they become more and less popular:
And in Eddi, interactive topic-based browsing of social status streams
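The decay calculation mentioned in the question, and the idea of topics becoming more and less popular over time, can be sketched with an exponentially decaying counter. This is only one possible approach under stated assumptions, not what Twitter or the cited work actually does. The class name `DecayingTrendScores`, the half-life parameter, and the requirement that mentions for a term arrive in chronological order are all assumptions of this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical scorer: each mention adds 1 to a term's score, and the score
// halves every halfLifeMillis, so old bursts fade and recent bursts dominate.
class DecayingTrendScores {
    private final double halfLifeMillis;
    private final Map<String, Double> scores = new HashMap<>();
    private final Map<String, Long> lastUpdate = new HashMap<>();

    DecayingTrendScores(double halfLifeMillis) {
        this.halfLifeMillis = halfLifeMillis;
    }

    // Records one mention. Assumes timestamps per term are non-decreasing.
    void mention(String term, long timestampMillis) {
        double decayed = scoreAt(term, timestampMillis);
        scores.put(term, decayed + 1.0);
        lastUpdate.put(term, timestampMillis);
    }

    // Returns the term's score decayed to the given moment (0 if never seen).
    double scoreAt(String term, long nowMillis) {
        Double stored = scores.get(term);
        if (stored == null) {
            return 0.0;
        }
        double elapsed = Math.max(0L, nowMillis - lastUpdate.get(term));
        return stored * Math.pow(0.5, elapsed / halfLifeMillis);
    }
}
```

Feeding every candidate term of every tweet into `mention` with the tweet's timestamp, and then sorting terms by `scoreAt(term, now)`, gives a ranking in which a term with a few very recent mentions can outrank one with many old mentions. A shorter half-life makes the list change more often.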