python: fast dictionary word lookup with wildcards*
Given a text that is split into a list of words, I want to look up each word in a dictionary of words, which is also read from a text file and split with split('\n').
Rather than checking whether each word is contained in the dictionary (which is gruesomely slow), I need to select a list of elements based on wildcards. The '*' is always at the end, i.e. no permuterm solution is required. For instance, the solution should select all dictionary elements starting with 'dep' without traversing the entire dictionary list.
Performance is of the essence in this case. I thought of a B-tree... but
- What would be the best package and data type for a fast implementation in Python?
- Please provide code examples.
Use a DAWG (directed acyclic word graph), which wastes less space than a trie. There are a few Python implementations, but for a start take a look here.
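If installing a third-party package is not an option, a sorted list searched with the standard-library bisect module is a simple baseline for trailing-wildcard lookups. To be clear, this is not a DAWG, just a stand-in that also avoids scanning the whole list. The function name words_with_prefix is made up for this sketch, and it assumes the dictionary has been sorted once up front.

```python
from bisect import bisect_left


def words_with_prefix(sorted_words, prefix):
    """Return every word in sorted_words that starts with prefix.

    sorted_words must already be sorted. Two binary searches find the
    slice boundaries, so the cost is O(log n) plus the size of the result.
    """
    lo = bisect_left(sorted_words, prefix)
    # Assumption: no dictionary word contains the highest Unicode code
    # point, so prefix + that character sorts after every real match.
    hi = bisect_left(sorted_words, prefix + "\U0010ffff")
    return sorted_words[lo:hi]


# Tiny in-memory list; in practice this would come from the word file.
words = sorted(["dog", "depart", "dep", "deploy", "de", "dept"])
matches = words_with_prefix(words, "dep")
```

Sorting costs O(n log n) once; after that each 'dep*' style query only touches the matching slice.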
You want a trie. Use the PyTrie package.
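To show what a trie does for this kind of query, here is a minimal pure-Python sketch. It is not the PyTrie API; the class name Trie and the methods add and with_prefix are invented for illustration. Each node is a plain dict keyed by character, and an empty-string key marks the end of a stored word.

```python
class Trie:
    """Minimal dict-of-dicts trie supporting prefix lookup."""

    _END = ""  # sentinel key: a word ends at this node

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self._END] = True

    def with_prefix(self, prefix):
        """Return all stored words starting with prefix, sorted."""
        # Walk down to the node for the prefix; bail out if it is absent.
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return []
        # Collect every word below that node with an explicit stack.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            for key, child in current.items():
                if key == self._END:
                    results.append(text)
                else:
                    stack.append((child, text + key))
        return sorted(results)


trie = Trie(["dep", "depart", "deploy", "dog", "de"])
matches = trie.with_prefix("dep")
```

The lookup only visits the nodes under the 'dep' branch, so its cost depends on the prefix length and the number of matches rather than on the size of the whole dictionary. A dedicated package will usually be faster and more memory-efficient than nested dicts.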