Regex match for a non-English language in Python

I'm trying to capture and match Russian-language characters in a Python script. Since Russian characters don't fall in the [a-Z] range, what regex should I use to match them? I can't use (.*) because it would match everything.

linkpat = re.compile('name=[a-Z]+;size=[0-9]+')


Use the Unicode flag:

re.compile(r'name=\w+;size=\d+', re.U)

This would also match any letter in any language (plus digits and underscore), not just Russian, though.
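
If only Russian letters should match, one option is an explicit Cyrillic character class instead of \w. The following is a minimal sketch, assuming Python 3 and that "Russian" means the 33 letters of the modern Russian alphabet; the names any_word, russian_only and samples are made up for illustration.

```python
import re

# Broad pattern: in Python 3, \w on a str pattern is already Unicode-aware,
# so re.U is redundant there (it matters for unicode strings on Python 2).
any_word = re.compile(r'name=\w+;size=\d+')

# Narrow pattern: Russian letters only. The ranges а-я and А-Я do not
# include ё/Ё, so those two are listed explicitly.
russian_only = re.compile(r'name=[а-яА-ЯёЁ]+;size=[0-9]+')

samples = [
    'name=Привет;size=42',  # Russian
    'name=hello;size=42',   # Latin
    'name=γειά;size=42',    # Greek
]

for s in samples:
    # Print whether each pattern matches the whole sample string.
    print(s, bool(any_word.fullmatch(s)), bool(russian_only.fullmatch(s)))
```

The broad pattern accepts all three samples, while the explicit class accepts only the Russian one. Other Cyrillic-script languages (Ukrainian, Serbian, and so on) use letters outside these ranges, so the class would need extending for them.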


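You can try \w with the re.LOCALE flag and the correct locale set. Note that in Python 3.6 and later, re.LOCALE is only allowed with bytes patterns.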


Use character classes such as \w, which are locale-dependent when re.LOCALE is set.
