Confusion in finding the order of the data structure

2022-12-21 05:06 问答作者：

Today I attended a written test conducted by a company. The overall test was focussed on data structures. I got a problem which I thought I solved. But I had a tough time in calculating the Big O function for the data structure. I will provide the question and the answer I came up with.

Given a document you need to store and the words in the document and should be able to return the count when any word is entered. You are provided with char* GetNextWord().

What Data structure will you choose

Give the Algorithm

What will be the order of your algorithm

For question 1, I wrote I will go for TRIE data structure. For question 2, I gave a brief algo开发者_如何学Gorithm. I wrote I would construct the TRIE data structure as following.

struct TRIE{
 boolean isWord;
 int count;
 Node* myList;
}

struct Node{
 char* character;
 Node *next;
 TRIE *child;
}

I have methods constructTrie() which will do a addToTrie() for each word.

I wrote the order of addToTrie() would be O(k) where k is the length. And the order of constructTrie() would be N*O(k) where N would be the number of words.

Now my question is this: Whether the orders I've mentioned is correct or not? If not, how to attack problems like this in the future (given a ds finding the order). I got really confused after using O(k). It makes me assume O(1).

Hints/Tips/Advise are wide open!!

Edit : Corrected the question clearly mentioning that the word count should be stored for all unique words.

Comparing two generic strings take Θ(k) (k = min strlen), and the number of words is N which you have to look through, so Ω(Nk) should be the most efficient complexity you can get.

If you really want to use a trie, then addToTrie() would indeed be O(k) where k is the length of the word you are adding. constructTrie() would take O(Nk) where N is the number of words, if you just call addToTrie() for every word. However, you don't need to call the addToTrie() function for every word. Once you have finished adding a word, just reset a trie pointer to the trie's root, then move the pointer as you are moving over your current word, adding the characters as you go along. Pseudocode:

trieNode *curr = trieRoot;
for each character c in document
  if it's a word terminator (space etc)
    add a character at curr signaling the end of the current word ('\0' maybe);
    curr = trieRoot;
  else if character is not a separator
    add character c at curr->next->character[c];
    curr = curr->next;

This will give you O(C) running time for constructing the trie, where C is the number of characters in your document.

Now, this begs the question: why do you need the trie at all? Obviously you figured out a way to detect when a word has ended, so why must you add your words to a trie? It's overkill. The only datastructure you need is a few variables: one to keep track of the current character, one to keep track of the previous character and one to count the words. This is easily done in O(C) like this:

char prev = '\0';
char curr;
int count = 0;

for each character curr
  if curr is a word separator and prev isn't 
    ++count;
  prev = curr;

I think it makes no sense to use a trie for this problem, it's only complicating things. I think if they wanted to test your knowledge of tries they would have given you a problem where a trie made more sense.

Even if they gave you a getNextWord() function (did you HAVE to use it? because you can do better without it), I'm guessing it returns "\0" or something when there are no more words? So why can't you just call it until it returns "\0" and count the words like that? Either way, a trie doesn't really make sense here.

继续阅读：algorithm data-structures trie

Confusion in finding the order of the data structure

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？