
Problems matching caret in Python regex


I have the following regular expression, which I think should match any character that is not alphanumeric, '!', '?', or '.':

a = re.compile('[^A-z ?!.]')

However, I get the following weird result in IPython:

In [21]: re.sub(a, ' ', 'Hey !$%^&*.#$%^&.')
Out[21]: 'Hey !  ^  .   ^ .'

The result is the same when I escape the '.' in the regular expression.

How do I match the caret so that it is removed from the string as well?


You have an error in your regular expression. Note that the case of the A and the z matters. A-z includes all characters between ASCII value 65 (A) and 122 (z), which includes the caret character (ASCII code 94), along with [, \, ], _ and `.

Try this instead:

re.compile('[^A-Za-z ?!.]')

Example:

import re
regex = re.compile('[^A-Za-z ?!.]')
result = regex.sub(' ', 'Hey !$%^&*.#$%^&.')
print(result)

Result:

Hey !     .     .
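
To see why the caret slips through, it can help to list the characters that sit between 'Z' and 'a' in ASCII. This is only a small sketch; the names extra, loose and strict are made up for illustration and are not from the question.

```python
import re

# Characters whose code points lie strictly between 'Z' (90) and 'a' (97).
extra = [chr(c) for c in range(ord('Z') + 1, ord('a'))]
print(extra)  # ['[', '\\', ']', '^', '_', '`']

loose = re.compile('[A-z]')      # the range used in the question
strict = re.compile('[A-Za-z]')  # letters only

# Each of these characters is matched by the loose range but not the strict one.
for ch in extra:
    print(repr(ch), bool(loose.match(ch)), bool(strict.match(ch)))
```

So with [^A-z ...] all six of those characters are treated as "allowed" and left in the string, not just the caret.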


The caret falls between the upper and lower cases in ASCII. You need [^a-zA-Z ?!\.]
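
As a side note, escaping the '.' is optional here, since a dot inside a character class is already literal. A quick check, assuming the same input string as in the question (plain and escaped are just illustrative names):

```python
import re

text = 'Hey !$%^&*.#$%^&.'

# Raw strings keep the backslash from being treated as a string escape.
plain = re.compile(r'[^a-zA-Z ?!.]')
escaped = re.compile(r'[^a-zA-Z ?!\.]')

# Both patterns replace exactly the same characters.
print(plain.sub(' ', text) == escaped.sub(' ', text))  # True
```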
