
Term frequency using a Java program

I have a set of documents. I want to know the frequency count of each word in each document (i.e. the term frequency) using a Java program. I already know how to find the frequency count for each word. My question is how to get the unique words in each document from the list of documents. Thanks in advance.


You can split each document on spaces and punctuation, go through the resulting array, and count the frequency of each word (a Map<String, Integer> would really help you with this). The keys of that map are then the unique words of the document.
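A minimal sketch of that approach, assuming each document is already loaded as a String. The class and method names (TermFrequency, countTerms, countTermsPerDocument) are made up for illustration, and lower-casing the text and treating every non-letter, non-digit character as a separator are assumptions you may want to change:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TermFrequency {

    // Counts how often each word occurs in a single document.
    // The key set of the returned map is the set of unique words.
    static Map<String, Integer> countTerms(String document) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Assumption: words are runs of letters/digits; everything else separates them.
        String[] tokens = document.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            // split() can yield an empty first token when the text starts with a separator.
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Builds one frequency map per document, in the same order as the input list.
    static List<Map<String, Integer>> countTermsPerDocument(List<String> documents) {
        List<Map<String, Integer>> result = new ArrayList<>();
        for (String document : documents) {
            result.add(countTerms(document));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList(
                "the cat sat on the mat",
                "The dog; the DOG!");
        List<Map<String, Integer>> frequencies = countTermsPerDocument(documents);
        for (int i = 0; i < frequencies.size(); i++) {
            Map<String, Integer> tf = frequencies.get(i);
            System.out.println("Document " + i + " unique words: " + tf.keySet());
            System.out.println("Document " + i + " frequencies: " + tf);
        }
    }
}
```

Calling keySet() on each map gives the unique words of that document, so no separate pass is needed to collect them.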


Resources :

  • Java - faster data structure to count word frequency?

On the same topic :

  • How to count words in java


If this is more than a one-time problem, you should consider using Lucene to index your documents. Then this post would help you answer your question.
