Quick Python Regex Question: Matching negated sets of characters
I want to find strings that do NOT match a particular sequence of characters. For example, something like:
REGEX = r'[^XY]*'
I'd like to match strings made up of any characters, as long as an X is never immediately followed by a Y. The REGEX above doesn't work because it excludes X and Y individually, wherever they appear, not just the sequence XY.
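To make the problem concrete, here is a small sketch of how the character-class version behaves. The helper name whole_string_matches is only for illustration, and it uses re.fullmatch so the pattern has to cover the entire string:

```python
import re

# The pattern from the question: a run of characters that are neither X nor Y.
REGEX = r'[^XY]*'

def whole_string_matches(s):
    # fullmatch succeeds only if the pattern covers the entire string.
    return re.fullmatch(REGEX, s) is not None

# A lone X or Y is rejected even though the string never contains "XY".
for sample in ["abc", "aXb", "aYXb", "aXYb"]:
    print(repr(sample), whole_string_matches(sample))
```

Only "abc" passes here, which is stricter than what I actually want: "aXb" and "aYXb" should be accepted too.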
How about:
# s is the string being tested
if "XY" not in s:
    print("matched")
else:
    print("not matched")
Or is this for inclusion in some longer regexp? Then maybe you want a negative lookahead expression:
REGEXP="...(?!XY)..."
EDIT: fixed typo
There are a few ways to do that.
^(?!.*XY).*$
The lookahead expression tries to match an XY sequence anywhere in the string. It's a negative lookahead, so if it finds one, the match attempt fails. Otherwise the .* goes ahead and consumes the whole string.
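A minimal sketch of that first pattern in use, assuming single-line input. The names NO_XY and lacks_xy are made up for the example:

```python
import re

# First pattern: a negative lookahead at the start rejects the string
# if "XY" occurs anywhere after it; otherwise .* consumes everything.
NO_XY = re.compile(r'^(?!.*XY).*$')

def lacks_xy(s):
    """Return True when s contains no "XY" (single-line input assumed)."""
    return NO_XY.match(s) is not None

for sample in ["abc", "aXbY", "YX", "aXYb"]:
    print(repr(sample), lacks_xy(sample))
```

Only the last sample contains X directly followed by Y, so it is the only one rejected.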
^(?:(?!XY).)*$
This one repeatedly matches any character (.), but only after the lookahead confirms that the character is not the beginning of an XY sequence.
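Here is a rough sketch of the second pattern. Because the check happens character by character, the same group can also sit inside a longer regex. The bracket-matching pattern below is a hypothetical example of that, not something from the question:

```python
import re

# Second pattern: each character is consumed only if "XY" does not start there.
TEMPERED = re.compile(r'^(?:(?!XY).)*$')

def lacks_xy_tempered(s):
    return TEMPERED.match(s) is not None

# Hypothetical embedding: capture bracketed text that contains no "XY".
BRACKETED = re.compile(r'\[((?:(?!XY).)*)\]')

def first_clean_bracket(s):
    m = BRACKETED.search(s)
    return m.group(1) if m else None

print(lacks_xy_tempered("aXbY"))             # no "XY" in this string
print(lacks_xy_tempered("aXYb"))             # contains "XY"
print(first_clean_bracket("[aXYb] [cXd]"))   # skips the bracket containing "XY"
```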
^(?:[^X]+|X(?!Y))*$
Repeatedly matches one or more of any character except X, or X if it's not followed by Y.
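A short sketch of the third pattern, with the illustrative names NO_XY_CLASS and lacks_xy_class:

```python
import re

# Third pattern: runs of non-X characters, or an X that is not followed by Y.
NO_XY_CLASS = re.compile(r'^(?:[^X]+|X(?!Y))*$')

def lacks_xy_class(s):
    return NO_XY_CLASS.match(s) is not None

for sample in ["abc", "X", "XX", "XXY", "aXYb"]:
    print(repr(sample), lacks_xy_class(sample))
```

One caveat that is my own observation rather than part of the answer above: since [^X]+ sits inside a repeated group, a long input that ends up failing may cause heavy backtracking in Python's re module, so it is probably worth timing on realistic data.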
With the first two regexes, you have to apply the DOTALL modifier (re.DOTALL in Python, or an inline (?s)) if there might be newlines in the source string. The third one doesn't need that because it uses a negated character class, [^X], instead of a dot.
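The newline issue can be sketched like this. The sample strings and variable names are assumptions chosen for the demonstration:

```python
import re

PATTERN = r'^(?!.*XY).*$'
plain = re.compile(PATTERN)              # dot stops at newlines
dotall = re.compile(PATTERN, re.DOTALL)  # dot also matches newlines
class_based = re.compile(r'^(?:[^X]+|X(?!Y))*$')

clean = "ab\ncd"   # two lines, no "XY"
dirty = "ab\nXY"   # "XY" on the second line

for name, rx in [("plain", plain), ("dotall", dotall), ("class_based", class_based)]:
    print(name, bool(rx.match(clean)), bool(rx.match(dirty)))
```

Without DOTALL, the first pattern fails even on the clean two-line string, because .* stops at the newline and $ cannot match there. With DOTALL, or with the character-class version, the clean string is accepted and the one containing XY is rejected.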