
Need some suggestions on my SVM feature refinement

I've trained an SVM classifier that, given a question, decides whether a webpage is a good one for answering that question.

The features I selected are "term frequency in the webpage", "whether the term matches the webpage title", "number of images in the webpage", "length of the webpage", "is it a Wikipedia page?", and "the position of this webpage in the list returned by the search engine".

Currently, my system has a precision of around 0.4 and a recall of 1. It produces a large share of false positives (that is, many bad links are classified as good links by my classifier).
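For reference, here is a minimal sketch of how these two numbers are computed, in plain Python. The labels are made up for illustration (4 good pages, 6 bad pages) and are not taken from the actual data. It also shows one possible reading of the figures: a classifier that marks every page as good gets a recall of 1 and a precision equal to the share of good pages in the test set, so that case may be worth ruling out.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means 'good page'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # good pages predicted good
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # bad pages predicted good
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # good pages predicted bad
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical test set: 4 good pages and 6 bad pages (made-up numbers).
y_true = [1] * 4 + [0] * 6

# A degenerate classifier that labels every page as good.
y_all_good = [1] * len(y_true)

precision, recall = precision_recall(y_true, y_all_good)
print(precision, recall)
```

If the real predictions look like `y_all_good`, the problem may lie less in the choice of features and more in the class balance or the scaling of the training data.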

Since the accuracy could be improved, I would like to ask for help refining the features I selected for training/testing: which ones I could remove, and which ones I could add.

Thanks in advance.


Hmm...

  • How large is your training set? i.e., how many training documents are you using?
  • What is your test set composed of?
  • Since you're getting too many FPs, I would try training with more (and varied) "bad" webpages.
  • Can you give more details about your different features, like "tf in webpage," etc.?
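
The first two questions can be answered with a quick count of the labels in each set. The sketch below is only an illustration: the helper name `label_summary` and the label lists are hypothetical, not taken from the asker's data.

```python
from collections import Counter


def label_summary(labels):
    """Map each label to a (count, fraction of the set) pair."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Hypothetical label lists; replace them with the real training and test labels.
train_labels = ["good"] * 30 + ["bad"] * 10
test_labels = ["good"] * 4 + ["bad"] * 6

print("train:", label_summary(train_labels))
print("test: ", label_summary(test_labels))
```

A training set with few "bad" pages compared to the test set, as in this made-up case, would be consistent with a classifier that leans toward predicting "good".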