Silly RegEx Confusion

Well, I've been using regular expressions with good success for a while, but I've run into a snag.

I have two string patterns that I would like to distinguish:

AAA(CR)(LF)*  

vs

AAA BBBBB(CR)(LF)*

Where A is a letter, B can be any character except (CR)/(LF), and (CR)/(LF) are carriage return and line feed (i.e., 0x0D/0x0A).

I've tried the following pattern:

"[A-Z ]+.+\x0D\x0A\*"

But, aggravatingly, this matches both of the patterns above! Shouldn't the .+ prevent the first pattern from being matched? As far as I understand, + is a greedy match of one or more of the previous token... Where am I going wrong?

Thanks,

Brian


Your regex matches AAA(CR)(LF) because the first two characters (AA) match [A-Z ]+ and then the third A matches .+.

Although + indicates a greedy match, the regex engine will backtrack: [A-Z ]+ first consumes all of AAA, and the engine then discovers that the rest of the expression can't match. So it retries with [A-Z ]+ matching only AA and finds that it can match the rest of the string.
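
To see the backtracking for yourself, here is a minimal Python sketch. The capture groups and variable names are my additions, and I'm assuming the trailing * in your samples is a literal asterisk, as the escaped \* in the pattern suggests.

```python
import re

# The pattern from the question, with two capture groups added (my addition)
# so we can see which part of the input each quantifier ended up with.
pattern = re.compile(r"([A-Z ]+)(.+)\x0D\x0A\*")

# Sample inputs; the trailing "*" is assumed to be a literal asterisk.
first = "AAA\r\n*"
second = "AAA BBBBB\r\n*"

m1 = pattern.match(first)
m2 = pattern.match(second)

# [A-Z ]+ gives back one character so that .+ has something to match.
print(m1.groups())  # ('AA', 'A')
print(m2.groups())  # ('AAA BBBB', 'B')
```

In both cases the character class hands its last character over to .+, which is why the pattern cannot tell the two inputs apart.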


Unless I misunderstood...

"[A-Z]+\x0D\x0A\*"

or

"[A-Z]+ .+\x0D\x0A\*"
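
A quick way to check those two patterns, sketched in Python. The classify helper and its labels are hypothetical names for illustration, and the sample strings again assume a literal trailing asterisk.

```python
import re

# The two patterns suggested above, compiled separately.
letters_only = re.compile(r"[A-Z]+\x0D\x0A\*")
letters_then_text = re.compile(r"[A-Z]+ .+\x0D\x0A\*")


def classify(line):
    """Return a label saying which of the two patterns matches the whole line."""
    if letters_only.fullmatch(line):
        return "letters only"
    if letters_then_text.fullmatch(line):
        return "letters plus text"
    return "no match"


print(classify("AAA\r\n*"))        # letters only
print(classify("AAA BBBBB\r\n*"))  # letters plus text
print(classify("AAA \r\n*"))       # no match
```

Dropping the space from the character class and requiring it explicitly is what removes the ambiguity. Note that in Python . also matches CR, so if B must never be a CR you may prefer [^\r\n]+ over .+; that is a variation on the patterns above, not what they do as written.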