Clever way of building a tag cloud? - Python
I've built a content aggregator and would like to add a tag cloud representing the current trends.
Unfortunately this is quite complex, as I have to look for keywords that represent the context of each article.
For example, words such as I, was, the, amazing, nice have no relation to context.
Help would be much appreciated! :)
Use NLTK, and in particular its Stopwords corpus:
Besides regular content words, there is another class of words called stop words that perform important grammatical functions, but are unlikely to be interesting by themselves. These include prepositions, complementizers, and determiners. NLTK comes bundled with the Stopwords corpus, a list of 2400 stop words across 11 different languages (including English).
NLTK can help you analyze the content in order to pick out relevant terms.
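As a rough sketch of the stop word approach: tokenize each article, drop stop words, and count what remains. The function names `load_stop_words` and `top_tags` are made up for this example, and the small fallback set is only an assumption so that the snippet still runs when NLTK or its Stopwords corpus (installed with `nltk.download("stopwords")`) is not available.

```python
import re
from collections import Counter


def load_stop_words():
    """Return English stop words from NLTK, or a tiny fallback set."""
    try:
        from nltk.corpus import stopwords
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        # NLTK or its Stopwords corpus is missing; use a minimal
        # illustrative list instead (an assumption, not NLTK's list).
        return {"i", "was", "the", "a", "an", "and", "is", "it", "of", "to", "in"}


def top_tags(articles, stop_words, n=10):
    """Count non-stop-word tokens across all articles; return the n most common."""
    counts = Counter()
    for text in articles:
        # Lowercase and keep runs of letters/apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts.most_common(n)


if __name__ == "__main__":
    articles = [
        "I was reading the Python news",
        "The Python release was amazing",
    ]
    for word, count in top_tags(articles, load_stop_words(), n=5):
        print(word, count)
```

The counts can then be mapped to font sizes for the cloud. Note that a stop word list removes words like "I", "was" and "the", but not opinion words like "amazing" or "nice"; those would probably need an extra custom list or a part-of-speech filter.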