
Conditional Regular Expressions

I'm using Python and I want to use regular expressions to check if something "is part of an include list" but "is not part of an exclude list".

My include list is represented by a regex, for example:

And.*

Everything which starts with And.

Also the exclude list is represented by a regex, for example:

(?!Andrea)

Everything, but not the string Andrea. The exclude list is obviously a negation.

Using the two examples above, I want to match everything which starts with And except for Andrea.

In the general case I have an includeRegEx and an excludeRegEx. I want to match everything which matches includeRegEx but does not match excludeRegEx. Note that excludeRegEx is still in the negative form (as you can see in the example above), so it would be more accurate to say: if something matches includeRegEx, I check whether it also matches excludeRegEx, and if it does, the match is satisfied. Is it possible to represent this in a single regular expression?

I think Conditional Regular Expressions could be the solution but I'm not really sure of that.

I'd like to see a working example in Python.

Thank you very much.


Why not put both in one regex?

And(?!rea$).*

Since the lookahead only "looks ahead" without consuming any characters, this works just fine (well, this is the whole point of lookaround, actually).

So, in Python:

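import re

subject = "Andrew"  # example input, chosen here for illustration

if re.match(r"And(?!rea$).*", subject):
    # Successful match.
    # Note that re.match always anchors the match
    # to the start of the string.
    print("match")
else:
    # Match attempt failed.
    print("no match")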
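
The `$` inside the lookahead matters: with it, only the exact string Andrea is rejected; without it, anything that continues with "rea" after "And" is rejected as well. A small sketch of the difference, using sample names picked purely for illustration:

```python
import re

# Rejects only the exact string "Andrea".
with_anchor = re.compile(r"And(?!rea$).*")
# Rejects every string that starts with "Andrea", e.g. "Andreas".
without_anchor = re.compile(r"And(?!rea).*")

# Sample inputs are assumptions made for this illustration.
for subject in ["Andrea", "Andreas", "Andrew"]:
    print(subject,
          bool(with_anchor.match(subject)),
          bool(without_anchor.match(subject)))
```

Which variant you want depends on whether the exclude list is meant to name whole strings or prefixes.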

From the wording of your question, I'm not sure if you're starting with two already finished lists of "match/don't match" patterns. If so, you could simply combine them automatically by concatenating the regexes, with the negative lookahead first. This works just as well but is uglier:

(?!Andrea$)And.*

In general, then:

(?!excludeRegex$)includeRegex
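
One possible way to build that combined pattern in Python is sketched below. The helper name `build_filter` is hypothetical, and the sketch assumes the exclude pattern is supplied in its plain, non-negated form (Andrea rather than (?!Andrea)); both patterns are wrapped in non-capturing groups so that alternations inside them do not leak into each other.

```python
import re

def build_filter(include_regex: str, exclude_regex: str) -> re.Pattern:
    """Hypothetical helper: compile (?!exclude$)include into one pattern.

    Assumes exclude_regex is given in positive form, e.g. "Andrea".
    """
    # The lookahead rejects strings that the exclude pattern matches in full;
    # the include pattern then performs the actual match.
    return re.compile(rf"(?!(?:{exclude_regex})$)(?:{include_regex})")

pattern = build_filter(r"And.*", r"Andrea")

# Sample inputs are assumptions made for this illustration.
for name in ["Andrea", "Andreas", "Andrew", "Bob"]:
    print(name, bool(pattern.match(name)))
```

If the exclude pattern already arrives as a negative lookahead such as (?!Andrea), plain concatenation (exclude first, then include) should be enough, though without the `$` it also rejects longer strings like Andreas.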