Interpreting Naive Bayes results
I have started using the NaiveBayes/Simple classifier in Weka for classification, but I have trouble understanding the results I get when training. The data set I'm using is weather.nominal.arff.
When I choose the "Use training set" test option, the classifier result is:
Correctly Classified Instances 13 - 92.8571 %
Incorrectly Classified Instances 1 - 7.1429 %
a b <-- classified as
9 0 | a = yes
1 4 | b = no
My first question: what should I understand from the incorrectly classified instance? Why did this happen? Which combination of attribute values was classified incorrectly? Is there a way to find out?
Secondly, when I try 10-fold cross-validation, why do I get a different (lower) number of correctly classified instances?
The results are:
Correctly Classified Instances 8 57.1429 %
Incorrectly Classified Instances 6 42.8571 %
a b <-- classified as
7 2 | a = yes
4 1 | b = no
You can get the individual predictions for each instance by enabling this option:
More Options... > Output predictions > PlainText
This will give you, in addition to the evaluation metrics, the following:
=== Predictions on training set ===
inst# actual predicted error prediction
1 2:no 2:no 0.704
2 2:no 2:no 0.847
3 1:yes 1:yes 0.737
4 1:yes 1:yes 0.554
5 1:yes 1:yes 0.867
6 2:no 1:yes + 0.737
7 1:yes 1:yes 0.913
8 2:no 2:no 0.588
9 1:yes 1:yes 0.786
10 1:yes 1:yes 0.845
11 1:yes 1:yes 0.568
12 1:yes 1:yes 0.667
13 1:yes 1:yes 0.925
14 2:no 2:no 0.652
This indicates that the 6th instance was misclassified (the + in the error column marks it). Note that even if you train and test on the same instances, misclassifications can occur due to inconsistencies in the data (the simplest example is having two instances with the same features but with different class labels), or simply because the model cannot fit every training instance.
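To connect the listing above with the summary figures, here is a minimal Python sketch that tallies it by hand. It is not part of Weka; the variable and function names are made up for illustration, and the rows are simply copied from the prediction output shown above.

```python
from collections import Counter

# Rows copied from the "Predictions on training set" listing above.
LISTING = """\
1 2:no 2:no 0.704
2 2:no 2:no 0.847
3 1:yes 1:yes 0.737
4 1:yes 1:yes 0.554
5 1:yes 1:yes 0.867
6 2:no 1:yes + 0.737
7 1:yes 1:yes 0.913
8 2:no 2:no 0.588
9 1:yes 1:yes 0.786
10 1:yes 1:yes 0.845
11 1:yes 1:yes 0.568
12 1:yes 1:yes 0.667
13 1:yes 1:yes 0.925
14 2:no 2:no 0.652
"""


def parse_listing(listing):
    """Return (instance number, actual label, predicted label) per row."""
    rows = []
    for line in listing.strip().splitlines():
        fields = line.split()
        inst = int(fields[0])
        actual = fields[1].split(":")[1]     # "2:no" -> "no"
        predicted = fields[2].split(":")[1]  # "1:yes" -> "yes"
        rows.append((inst, actual, predicted))
    return rows


rows = parse_listing(LISTING)

# Confusion matrix keyed by (actual, predicted).
confusion = Counter((actual, predicted) for _, actual, predicted in rows)
misclassified = [inst for inst, actual, predicted in rows if actual != predicted]
accuracy = sum(actual == predicted for _, actual, predicted in rows) / len(rows)

print(misclassified)                                          # [6]
print(confusion[("yes", "yes")], confusion[("yes", "no")])    # 9 0
print(confusion[("no", "yes")], confusion[("no", "no")])      # 1 4
print(f"{100 * accuracy:.4f} %")                              # 92.8571 %
```

The single off-diagonal count (actual no, predicted yes) is instance 6, which matches the one incorrectly classified instance in the summary.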
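Keep in mind that testing on the training set is biased (it is somewhat like cheating, since the model has already seen the answers to the questions). We are usually more interested in a realistic estimate of the model's error on unseen data. Cross-validation is one such technique: it partitions the data into 10 stratified folds, tests on one fold while training on the other nine, and repeats this so that every fold serves as the test set once. The reported figures combine the ten runs, which is why the cross-validation counts in the question still add up to 14 instances. This estimate is typically lower than the training-set accuracy, and with only 14 instances it is also quite noisy.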
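The mechanics of that procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fold assignment is a simple round-robin per class with no shuffling (Weka's own randomization and stratification differ in detail), and the learner is a hypothetical majority-class stand-in rather than Naive Bayes, so the number it produces is not comparable to the Weka output. The labels are the "actual" column from the listing above.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k):
    """Deal the indices of each class round-robin into k folds (assumed scheme)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        for idx in by_class[label]:
            folds[position % k].append(idx)
            position += 1
    return folds


def train_majority(train_labels):
    """Hypothetical stand-in learner: always predicts the most common class."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda: majority


def cross_validate(labels, k):
    """Train on k-1 folds, test on the held-out fold, pool counts over all folds."""
    folds = stratified_folds(labels, k)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_labels = [lab for i, lab in enumerate(labels) if i not in held_out]
        predict = train_majority(train_labels)
        correct += sum(predict() == labels[i] for i in test_idx)
    return correct, len(labels)


# "actual" column of the 14 weather instances, in listing order.
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]

folds = stratified_folds(labels, 10)
correct, total = cross_validate(labels, 10)
print(folds)
print(f"{correct} of {total} held-out instances predicted correctly")
```

Every instance lands in exactly one test fold, so the pooled counts always sum to the size of the data set, and no instance is ever predicted by a model that was trained on it.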