Is it possible to identify distinct words and a count for each, from fields containing text strings in Postgres? Something like this?
I am working on a project to fetch Google search result pages and then strip the HTML tags to obtain plain text content.
I am trying to write a simple Python script that uses NLTK to find and replace synonyms in a text file. The following code gives me an error:
The title for this one was quite tricky. I'm trying to solve a scenario. Imagine a survey was sent out to XXXXX people, asking them what their favourite football club was.
I've tried the following code and installed english-wsj-1.0 from http://code.google.com/p/hunpos/downloads/list
Please suggest a classifier that classifies documents based on the requirements mentioned below.
Let's say I have a bunch of essays (thousands) that I want to tag, categorize, etc. Ideally, I'd like to train something by manually categorizing/tagging a few hundred, and then let the thing loose.
Are there any open source/commercial libraries out there that can detect mailing addresses in text, just like how Apple's Mail app underlines addresses on the Mac/iPhone?
I would like to use JAPE/GATE for my own mother language (not English), as my documents are already tokenized and POS-tagged.
I am trying to do named entity recognition in Python using NLTK. I want to extract a list of personal skills. I have the list of skills and would like to search for them in a requisition and tag the skills.
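
For the last question, where the list of skills is already known, one possible starting point is a plain dictionary (gazetteer) lookup rather than a trained NER model. The sketch below is only an illustration under that assumption: it uses the standard `re` module instead of NLTK, and the function name `tag_skills`, the sample skills, and the sample requisition text are all hypothetical.

```python
import re

def tag_skills(text, skills):
    """Return (start, end, skill) for each known skill found in text."""
    if not skills:
        return []
    # Try longer skills first so "machine learning" wins over a shorter overlap.
    ordered = sorted(skills, key=len, reverse=True)
    # (?<!\w) and (?!\w) act as word boundaries that also work for names like "C++".
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(s) for s in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    # Map lowercased matches back to the spelling used in the skill list.
    canonical = {s.lower(): s for s in skills}
    return [(m.start(), m.end(), canonical[m.group(0).lower()])
            for m in pattern.finditer(text)]

# Hypothetical sample data, not taken from the question.
skills = ["Python", "SQL", "machine learning", "C++"]
requisition = "We need strong python and SQL skills; machine learning is a plus."
matches = tag_skills(requisition, skills)
for start, end, skill in matches:
    print(start, end, skill)
```

The character offsets returned for each match could then be used to mark up the requisition text or to build training data for a statistical tagger later on.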