Need to count the frequency of each term inside a document
I need to calculate the frequency of every term inside a document. How can I do that? I am not asking for code, just for guidance. I am doing a similarity calculation between a document and a query. I have already calculated the term frequencies for the query, but I do not know how to calculate the term frequency for each word inside the document. Can anyone guide me? Thank you for your attention.
You can use a HashMap, where the key is the term and the value is its frequency. Each time you see a term, you increase its value. Once the whole file has been processed, you have your counts.
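Even though you did not ask for code, a minimal sketch may make the idea concrete. The class name TermCounter, the method countTerms, and the tokenization rule (lowercase the text, split on anything that is not a letter or digit) are my own assumptions, not something from the question:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class TermCounter {
    // Hypothetical helper: maps each term in the text to how often it occurs.
    static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Assumption: a "term" is a run of letters/digits, compared case-insensitively.
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue; // split can produce an empty first element
            }
            // Insert 1 for a new term, otherwise add 1 to the stored count.
            counts.merge(term, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints the term counts; HashMap iteration order is not guaranteed.
        System.out.println(countTerms("The cat saw the other cat"));
    }
}
```

If your similarity measure needs stemming or stop-word removal, that would go inside the loop before the merge call.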
Yes, use a HashMap to store the counts. To go through the file, you can use a Scanner.
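As a rough sketch of the Scanner part, assuming the document is plain UTF-8 text and that whitespace-separated tokens are good enough as terms (the class and method names here are made up for illustration):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;
import java.nio.file.Paths;

class ScannerTermCounter {
    // Hypothetical helper: reads whitespace-separated tokens and counts them.
    static Map<String, Integer> countTerms(Scanner in) {
        Map<String, Integer> counts = new HashMap<>();
        while (in.hasNext()) {
            // Note: punctuation stays attached to the token with the default delimiter.
            String term = in.next().toLowerCase(Locale.ROOT);
            counts.merge(term, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // With a file path argument, read that file; otherwise use a small sample string.
        try (Scanner in = args.length > 0
                ? new Scanner(Paths.get(args[0]), "UTF-8")
                : new Scanner("to be or not to be")) {
            System.out.println(countTerms(in));
        }
    }
}
```

If punctuation matters for your terms, calling useDelimiter on the Scanner with a suitable pattern is one way to strip it.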
In Java you should definitely stay with HashMap<String, Integer>. The terms will be the HashMap keys and the term frequencies the values.
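Since you want to compare the document against a query, here is a small sketch of reading counts back out of such a map. The helper frequencyOf and the sample data are hypothetical; the point is only that getOrDefault returns 0 for a query term the document never contains:

```java
import java.util.HashMap;
import java.util.Map;

class TermLookup {
    // Hypothetical helper: frequency of a term in the document, 0 if it never occurred.
    static int frequencyOf(Map<String, Integer> counts, String term) {
        return counts.getOrDefault(term, 0);
    }

    public static void main(String[] args) {
        // Made-up counts standing in for a document that was already processed.
        Map<String, Integer> documentCounts = new HashMap<>();
        documentCounts.put("java", 3);
        documentCounts.put("map", 1);

        System.out.println(frequencyOf(documentCounts, "java"));   // prints 3
        System.out.println(frequencyOf(documentCounts, "python")); // prints 0
    }
}
```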