How can I automatically identify tags (keywords) from a given text?
It should behave like the Delicious toolbar for Firefox, which lists possible tags to click. The effect is shown below:
The code should be able to find keywords for the text. Is there a good algorithm or open source project you can recommend?
I found this post, but it is a bit too general for my specific need.
I think you're looking for one of these answers:
- tag generation from a text content
- How to extract common / significant phrases from a series of text entries
- tag generation from a small text content (such as tweets)
In a nutshell, you're looking to extract unigrams from the text that somehow represent the concepts within it. One technique for this is Pointwise Mutual Information (PMI), which is illustrated with an example in the first two links. The Python NLTK framework, which already has a number of these algorithms built in, might be your best starting point.
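To make the PMI idea concrete, here is a minimal sketch that scores adjacent word pairs using only the Python standard library, so it runs without installing anything. It is not the code from the linked answers; the function name `pmi_bigrams`, the `min_count` threshold, and the sample text are all assumptions made for illustration. PMI compares how often two words appear together with how often they would if they were independent, so pairs that co-occur more than chance get higher scores.

```python
import math
import re
from collections import Counter


def pmi_bigrams(text, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration; min_count is an assumed
    threshold that filters out pairs seen too rarely to trust.
    """
    # Very rough tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    word_counts = Counter(words)
    bigram_counts = Counter(zip(words, words[1:]))
    n_words = len(words)
    n_bigrams = max(n_words - 1, 1)

    scores = {}
    for (w1, w2), count in bigram_counts.items():
        if count < min_count:
            continue  # rare pairs tend to get inflated PMI, so skip them
        p_pair = count / n_bigrams
        p_w1 = word_counts[w1] / n_words
        p_w2 = word_counts[w2] / n_words
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


sample = (
    "Machine learning is fun. I read about machine learning every day, "
    "and the day I stop is the day I retire."
)
ranked = pmi_bigrams(sample)
for pair, score in ranked:
    print(" ".join(pair), round(score, 2))
```

On real text you would probably also drop stop words and tune the frequency threshold before treating the top pairs as tag candidates. If you go the NLTK route, its `nltk.collocations.BigramCollocationFinder` together with `BigramAssocMeasures.pmi` covers the same ground with less hand-written code.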
Good luck!