Regex for search queries

I have a page built in Django that has its own search engine. I need help constructing a regex that accepts only valid queries, i.e. queries consisting only of Polish alphabet letters (both upper- and lowercase) and the symbols * and ?. Can anyone be of assistance?

EDIT: I tried something like this:

query_re = re.compile(r'^\w*[\*\?]*$', re.UNICODE)
if not query_re.match(self.cleaned_data['query']):
    raise forms.ValidationError(_('Illegal character'))

but it also allows some invalid characters from other alphabets and won't allow queries like *somest?ing*.


If your locale is correctly set, you would use

query_re = re.compile(r'^[\w\*\?]*$', re.LOCALE|re.IGNORECASE)

\w matches all locale-specific alphanumeric characters plus the underscore, so digits and _ are accepted as well: http://docs.python.org/library/re.html
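To see how broad \w is in practice, here is a small check. It is only a sketch and assumes Python 3, where str patterns are Unicode-aware by default and re.LOCALE is only accepted for bytes patterns, so the flags are dropped here. The name accepts and the sample strings are made up for illustration.

```python
import re

# Same character class as above, without LOCALE/IGNORECASE
# (assumption: Python 3, where \w on str patterns is Unicode-aware).
pattern = re.compile(r'^[\w*?]*$')

def accepts(sample):
    """Return True if the whole sample is made of \\w characters, * or ?."""
    return pattern.match(sample) is not None

# Polish text, digits/underscore, Cyrillic text, and a wildcard query.
for sample in ["zażółć", "abc_123", "привет", "*somest?ing*"]:
    print(sample, accepts(sample))
```

All four samples are accepted, including the digits, the underscore and the Cyrillic word, so a \w-based class on its own will not restrict queries to Polish letters.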


Try something like

regex = r'(?iL)^[\s\*\?a-z]*$'

assuming your machine's locale is Polish. The first part (?iL) sets the locale and ignorecase flags. The ^ matches the start of the string, \s matches any whitespace, and a-z any lowercase letter (or uppercase, thanks to the ignorecase flag).

Alternatively, instead of using (?L) and a-z, you could just explicitly list the allowable letters (e.g. abcdefghijklmnopqrstuvwxyz followed by the Polish letters ąćęłńóśźż), which does not depend on the machine's locale.
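A minimal sketch of the explicit-list approach, assuming Python 3 and str (Unicode) input. The names QUERY_RE and is_valid_query are hypothetical, and keeping q, v and x in the class is an assumption (they are not in the traditional Polish alphabet but do appear in loanwords). Whitespace is left out because the question only mentions letters, * and ?.

```python
import re

# Assumption: allowed characters are ASCII letters, the nine Polish
# diacritic letters in both cases, and the wildcards * and ?.
# Inside a character class, * and ? need no escaping.
QUERY_RE = re.compile(r'[a-zA-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ*?]*')

def is_valid_query(query):
    """Return True if every character of query is in the allowed set."""
    # fullmatch anchors at both ends, so ^ and $ are not needed.
    return QUERY_RE.fullmatch(query) is not None
```

In the form from the question, this could replace the query_re.match(...) test, e.g. if not is_valid_query(self.cleaned_data['query']): raise forms.ValidationError(_('Illegal character')). Because the letters are listed explicitly, the result does not depend on the server's locale.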
