开发者

Autocompletion library in C++

I need an auto-completion routine or library in C++ for 1 million words. I guess I can find a routine on the net like Rabin–Karp. Do you know a library that does this. I don't see it in Boost.

Also, is it a crazy idea to use MySql LIKE SQL request to do that ?

Thank you

EDIT: It is true that it is more suggestions than auto-completion that I need (propose ten words when the user typed the first 2 letters). I actually also have expressions "Nikon d开发者_StackOverflow中文版igital camera". But for a first version, I only need suggestions on "Ni" of Nikon and not on "digital camera".


You don't have to use any crazy algorithm if you begin by preparing an index.

A simple Trie/Binary Search Tree structure, that keeps the words ordered alphabetically, would allow efficient prefix searches.

In C++, for example, the std::map class has the lower_bound member which would point in O(log N) to the first element that could possibly extend your word.


hmmmm, if you're thinking about using like, it means that most probably, you want to have classical autocompletion (begin of word is matching).

What about organising (nicely) your data into a 26-tree (one entry per letter, or if you support other than letters, an well chosen x-tree). That way, you organize your data once and then, you have quick result by tree parsing. if you want to limit the amount of results proposed into your autocompletion, you can adapt your tree parsing algorithm. Seems simple and efficient (a like syntax in SQL will have to compare all your items in your table each time, whereas my solution is much quicker once the data is correctly set)

Other solution, you can peek at Qt implementation of QCompleter (might be overkill to depend on Qt on your code, I don't know)


I worked on a project once that did something like this using CLucene. It worked fine.


You can use a trie (prefix tree) to store your words.

struct trie
{
  std::map<char, trie*> next;
  bool is_word;

  void insert(std::string w)
  {  
    trie * n = this;
    for (int i = 0; i < w.size(); ++i) {
      if (n->next.find(w[i]) == n->next.end()) {
        n->next[w[i]] = new trie();
      }
      n = n->next[w[i]];
    }
    n->is_word = true;
  }
};

Then you can easily get prefix matches iterating on subtrees.


You could write your own simple auto-completion function with using Damerau-Levenshtein distance.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜