开发者

autocomplete-like feature with a python dict

In PHP, I had this line matches = preg_grep('/^for/', array_keys($hash)); What it would do is it would g开发者_Go百科rab the words: fork, form etc. that are in $hash.

In Python, I have a dict with 400,000 words. It's keys are words I'd like to present in an auto-complete like feature (the values in this case are meaningless). How would I be able to return the keys from my dictionary that match the input?

For example (as used earlier), if I have

my_dic = t{"fork" : True, "form" : True, "fold" : True, "fame" : True}

and I get some input "for", It'll return a list of "fork", "form".


>>> mydict={"fork" : True, "form" : True, "fold" : True, "fame" : True}
>>> [k for k in mydict if k.startswith("for")]
['fork', 'form']

This should be faster than using a regular expression (and sufficient if you're just looking for word beginnings).


So this isn't a direct answer to what you ask, but..

It seems like you don't really want a dict for this sort of thing, you're looking for a tree-like structure, right?

Then you can walk the tree for each letter that is typed (constant time), and return leaves from that subsection of the tree as the words that match that prefix.


>>> my_dict = {"fork" : True, "form" : True, "fold" : True, "fame" : True}
>>> import re
>>> [s for s in my_dict if re.search('^for', s) is not None]
['fork', 'form']

Use of regex is more universal as you could provide more complex search patterns, if it's only about prefixes, you could use string methods: str.startwith, for example:

>>> [s for s in my_dict if s.startswith('for')]
['fork', 'form']


If you want a specific lookup strategy (such as the "startswith 3 chars" outlined above), you could probably get a quick win by creating a specific lookup dictionary based around that idea.

q = {"fork":1, "form":2, "fold":3, "fame":4}
from collections import defaultdict
q1 = defaultdict(dict)
for k,v in q.items():
    q1[k[:3]][k]=v

This would let you do a .startswith type lookup over a much smaller set

def getChoices(frag):
    d = q1.get(frag[:3])
    if d is None:
        return []
    return [ k for k in d.keys() if k.startswith(frag) ]

Hopefully that should be a lot quicker than processing the whole 400,000 keys.


You can get the keys from my_dict with my_dict.keys(). Then, you can search through each key to see if it matches your regular expression.

m = re.compile('^for')
keys = []
for key in my_dict.keys():
   if m.match(key) != None:
      keys.append(key)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜