Regex? Search Engine?

I've read through some of the documentation for the re module that ships with Python, but I just can't seem to get a grasp on it. In fact, I'm not sure it is what I'm looking for, so let me explain:

I have a huge dictionary. What I want is to be able to type in a search term, for example hello, and then have it search through the dictionary and give me a list like this:

hello, hell, hello world, hello123. Basically anything resembling the search criteria. Would I use regex for this or something else?


Since you are using Python, you should look at Xapian; it has great Python bindings.

What you are asking for is far more sophisticated than what regular expressions are designed for.

You need full text search, with stemming and other tricks to do the fuzzy matching.
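
For a feel of what "anything resembling the search term" can mean before reaching for a full search engine, here is a minimal sketch using only the standard library. It is not Xapian and does no stemming; it combines a substring check with `difflib.get_close_matches`. The function name `similar_entries` and the sample word list are made up for illustration.

```python
import difflib

def similar_entries(query, words, limit=10, cutoff=0.6):
    """Return entries that contain the query, are contained in it,
    or score at least `cutoff` on difflib's similarity ratio."""
    query_l = query.lower()
    # Substring relation in either direction:
    # "hello world" contains "hello"; "hell" is contained in "hello".
    related = [w for w in words
               if query_l in w.lower() or w.lower() in query_l]
    # Approximate matches by similarity ratio (best matches first).
    close = difflib.get_close_matches(query, words, n=limit, cutoff=cutoff)
    # Merge the two lists, keeping order and dropping duplicates.
    result = list(related)
    for w in close:
        if w not in result:
            result.append(w)
    return result

# Hypothetical sample data standing in for the "huge dictionary".
words = ["hello", "hell", "hello world", "hello123", "goodbye", "python"]
print(similar_entries("hello", words))
```

This scans every entry on each query, so for a really large dictionary an indexed search engine is likely to scale better.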


You might want to look at something that can compute a Levenshtein (edit) distance. There's an excellent article here on how to build something like what you are describing from scratch (in Python, and it has since been ported to lots of other languages).

You might not want to go the "from-scratch" route, but the article will give you lots of interesting background that should help you decide which tool has the right level of sophistication for you. Xapian, as suggested above, Lucene, and other full-text search engines will provide this kind of capability, and it can be very sophisticated, but then again you might not need all that.
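
To make the edit-distance idea concrete, here is a small sketch of the classic dynamic-programming Levenshtein computation. It is not the code from the article mentioned above; the helper names `levenshtein` and `closest` and the `max_distance` threshold are assumptions chosen for illustration.

```python
def levenshtein(a, b):
    """Edit distance between a and b, where each insertion,
    deletion, or substitution costs 1."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]

def closest(query, words, max_distance=2):
    """Return the words within max_distance edits of query, nearest first."""
    scored = sorted((levenshtein(query, w), w) for w in words)
    return [w for d, w in scored if d <= max_distance]

print(closest("hello", ["hell", "hello", "help", "world"]))
```

Note that edit distance alone would not return "hello world" for the query "hello" with a small threshold, since that takes six insertions; that is one reason a full-text engine, or a combination with substring matching, may suit the original question better.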


There is a new regexp module in the PyPI repository (which may eventually replace the current Python re module).

It allows fuzzy matching.
