开发者

Find word with maximum number of occurrences

What is the most optimal way (algorithm) to search for the word that has the maximum number of occurrence开发者_运维知识库s in a document?


Finding the word that occures most times in a document can be done in O(n) by a simple histogram [hash based]:

histogram <- new map<String,int>
for each word in document: 
   if word in histogram:
      histogram[word] <- histogram[word] + 1
   else:
      histogram[word] <- 1
max <- 0
maxWord<- ""
for each word in histogram:
  if histogram[word] > max:
     max <- histogram[word]
     maxWord <- word
return maxWord

This is O(n) solution, and since the problem is clearly Omega(n) problem, it is optimal in terms of big O notation.


  1. Scan the document once, keeping a count of how many times you have seen every unique word (perhaps using a hashtable or a tree to do this).
  2. While performing step 1, keep track of the word that has the highest count of all words seen so far.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜