Document Classification
Please suggest a classifier that can classify documents according to the requirements below.
I have a set of documents to classify. For each class label, I have a set of terms that are specific to that label.
If you already have the terms for your classes, you can use several kinds of classifiers, e.g. an SVM, a Naive Bayes classifier, or even a neural network.
There are libraries that include these classifiers, such as Weka or Mahout.
Recently I wrote an example of how to do this with a Naive Bayes classifier: Naive Bayes Example. It is more an explanation of the concept than a tool you could use in the real world.
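To make the Naive Bayes idea concrete, here is a minimal sketch in plain Python. It is not the linked example and not what Weka or Mahout do internally. It assumes that each class's term list can be treated as one tiny training document, and the names `NaiveBayes`, `class_terms`, and the sample terms are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> term frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # log P(label) + sum of log P(token | label)
            score = math.log(self.doc_counts[label] / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for t in tokens:
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical class-specific terms, standing in for the asker's term lists.
class_terms = {
    "sports": ["football", "goal", "match", "team"],
    "finance": ["stock", "market", "bank", "interest"],
}

# Assumption: seed the model by treating each term list as one document.
model = NaiveBayes().fit(
    [" ".join(terms) for terms in class_terms.values()],
    list(class_terms.keys()),
)

label = model.predict("The team scored a late goal in the match")
```

With real labelled documents you would pass those to `fit` instead of the term lists. Seeding from the term lists alone is only a starting point, since every term then counts equally within its class.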
As you have labels attached to the documents, this comes under supervised learning. You can use any of the classifiers below for document classification:
1. Naive Bayes classifier
2. Nearest-neighbour classifier
3. Decision trees
4. Subspace method
Most ML libraries have implementations of the above techniques. If you want to choose an ML library based on the programming language you are comfortable with, you can refer to this link: http://daoudclarke.github.io/machine%20learning%20in%20practice/2013/10/08/machine-learning-libraries/
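As a rough illustration of the nearest-neighbour option from the list, the sketch below classifies a document by majority vote among its `k` most similar labelled documents, using cosine similarity over raw word counts. It assumes you have a few labelled documents. The function names and sample sentences are invented for this example and do not come from any particular library.

```python
import math
from collections import Counter


def vectorize(text):
    # Bag-of-words vector: term -> count (lowercased, whitespace tokens).
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def knn_predict(text, labeled_docs, k=3):
    # Rank the labelled documents by similarity to the query,
    # then take a majority vote among the top k.
    query = vectorize(text)
    scored = sorted(
        ((cosine(query, vectorize(doc)), label) for doc, label in labeled_docs),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled documents, for illustration only.
training = [
    ("the team won the football match", "sports"),
    ("a late goal decided the match", "sports"),
    ("the bank raised its interest rate", "finance"),
    ("the stock market fell sharply", "finance"),
]

knn_label = knn_predict(
    "stock prices fell as the bank cut its interest rate", training, k=3
)
```

In practice you would probably remove stop words or weight terms with TF-IDF first, because common words such as "the" otherwise inflate the similarity between unrelated documents.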