
Representing a document as a vector with Lucene

I want to build document vectors for SVM text categorization. I have indexed my documents in two classes, POSITIVE and NEGATIVE, and I have selected my feature space with the IG (information gain) method.

How can I use Lucene to represent each document as a vector of tf-idf term weights?

Thanks!

Best regards!


Apache Mahout is a machine learning library written in Java. It has utilities for creating document vectors from a Lucene index (built from raw text). You can adapt that code to your requirements.
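If it helps to see the weighting itself, here is a minimal sketch in plain Java of turning one tokenized document into a tf-idf vector over a fixed feature list. It is not Mahout's or Lucene's API: the class `TfIdfSketch`, its methods, and the sample data are all hypothetical, and it assumes the common weighting tf * ln(N / df). With a real index you would read the term and document frequencies from Lucene instead of recounting them from token lists.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or Mahout.
final class TfIdfSketch {

    // Document frequency: the number of documents in which each term occurs.
    static Map<String, Integer> documentFrequencies(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // Use a set so a term is counted once per document.
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Builds one vector whose i-th entry is the tf-idf weight of features.get(i).
    // Assumed weighting: tf * ln(numDocs / df); features absent from the
    // document (or from the whole collection) get weight 0.
    static double[] vectorize(List<String> doc, List<String> features,
                              Map<String, Integer> df, int numDocs) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : doc) {
            tf.merge(term, 1, Integer::sum);
        }
        double[] vector = new double[features.size()];
        for (int i = 0; i < features.size(); i++) {
            String feature = features.get(i);
            int termFreq = tf.getOrDefault(feature, 0);
            int docFreq = df.getOrDefault(feature, 0);
            if (termFreq > 0 && docFreq > 0) {
                vector[i] = termFreq * Math.log((double) numDocs / docFreq);
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        // Made-up sample data: two tiny tokenized documents.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("good", "good", "film"),
                Arrays.asList("bad", "film"));
        // Stand-in for the features chosen by information gain.
        List<String> features = Arrays.asList("good", "bad", "film");

        Map<String, Integer> df = documentFrequencies(docs);
        for (List<String> doc : docs) {
            System.out.println(Arrays.toString(vectorize(doc, features, df, docs.size())));
        }
    }
}
```

Each resulting array can then be written out as one row in whatever input format your SVM library expects, with the POSITIVE or NEGATIVE label attached. Keeping the feature order fixed across all documents is what makes the rows comparable.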
