regex matches repeating group {0,2} or {0,4} but {0,3} doesn't

First, this is using PHP's preg functions.

String I'm trying to match:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp

My regexes and their matches:

(\S*\s*){0,1}\S*p = "d xp"
(\S*\s*){0,2}\S*p = "c d xp"
(\S*\s*){0,3}\S*p = NO MATCH (expecting "b c d xp")
(\S*\s*){0,4}\S*p = entire string
(\S*\s*){0,5}\S*p = entire string

Oddly, if I remove a single "a" it works. Also, (\S*\s*){0,3}\Sp or (\S*\s){0,3}\S*p both work.

Can someone explain why the third case results in no matches instead of "b c d xp"?

TIA!


Good question.

I tried another language that also has Perl RE syntax, Ruby, and it returned the expected string:

$ irb
>> s='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp'
=> "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp"
>> s[/(\S*\s*){0,3}\S*p/]
=> "b c d xp"

This made me think you found an interpreter bug...

But we now know that

  • Your RE was correct, as was your expectation of its results
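  • PHP has a limit on backtracks, and the problem was that your expression hit that limit. Ruby either doesn't check or has a different limit. This also fits your observation that removing a single "a" makes it work: a shorter run of "a" needs fewer backtracking steps.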


preg_last_error() returns PREG_BACKTRACK_LIMIT_ERROR, so raising the backtrack limit should fix the issue. Try:

 ini_set('pcre.backtrack_limit', 500000);
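
To tell a genuine "no match" apart from PCRE giving up, you can check the return value of preg_match() and then ask preg_last_error() what happened. Below is a minimal sketch, assuming PHP 7.1 or later. The helper name match_checked is made up for this illustration and is not part of PHP. Whether the limit is actually hit depends on your PHP version and on the pcre.backtrack_limit and pcre.jit settings, so no particular output is claimed here.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the matched text,
// null when the pattern simply does not match, and throws when PCRE
// aborted the match with an error.
function match_checked(string $pattern, string $subject): ?string
{
    $result = preg_match($pattern, $subject, $m);
    if ($result === false) {
        // preg_match() returns false on error; preg_last_error() says which one.
        $code = preg_last_error();
        $reason = ($code === PREG_BACKTRACK_LIMIT_ERROR)
            ? 'backtrack limit exhausted'
            : "PCRE error code $code";
        throw new RuntimeException($reason);
    }
    return $result === 1 ? $m[0] : null;
}

// The string and pattern from the question.
$s = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp';
try {
    var_dump(match_checked('/(\S*\s*){0,3}\S*p/', $s));
} catch (RuntimeException $e) {
    echo 'Match aborted: ', $e->getMessage(), "\n";
}
```

If this reports an exhausted backtrack limit, the regex itself is fine and the ini_set() call above is the thing to adjust.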