
Purpose of (.+?) in regular expressions

I am kinda new to this Regex thing.

When analyzing some code I frequently come across the pattern .+? or (.+?)

I can't seem to find the meaning of this pattern using my noobish deductive reasoning.


. means any character (except a newline). + means one or more. ? in this context means lazy, or non-greedy: the quantifier matches as few characters as it can while still letting the rest of the pattern succeed. Example:

> 'abc'.match(/.+/)
["abc"]
> 'abc'.match(/.+?/)
["a"]
> 'abc'.match(/.*/)
["abc"]
> 'abc'.match(/.*?/)
[""]
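The question also asks about the parenthesized form. As far as I can tell, the parentheses in (.+?) only add a capturing group around the lazy match, so the captured text can be read back afterwards. Here is a minimal sketch of the case where laziness matters, which is when something follows the group. The input string and variable names are made up for illustration:

```javascript
// Hypothetical input, chosen only to show the difference.
const html = '<b>one</b> and <b>two</b>';

// Greedy: .+ grabs as much as possible, then backtracks to the LAST </b>.
const greedy = html.match(/<b>(.+)<\/b>/);

// Lazy: .+? grows one character at a time and stops at the FIRST </b>.
const lazy = html.match(/<b>(.+?)<\/b>/);

console.log(greedy[1]); // "one</b> and <b>two"
console.log(lazy[1]);   // "one"
```

Index 1 of the result holds what the first capturing group matched; index 0 holds the whole match.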


It depends what kind of knowledge you have about patterns. Here's an explanation that assumes you have some kind of basic idea about what regular expressions are:

  • . matches any character (except a newline)
  • + means repeat the preceding pattern 1 or more times
  • so far, .+ means one or more characters
  • ? after a quantifier means ungreedy (lazy): the match stops at the first position where the rest of the pattern can match.

A quick explanation on greediness:

/.+X/.exec("aaaXaaaXaaa");
["aaaXaaaX"]
/.+?X/.exec("aaaXaaaXaaa");
["aaaX"]

As you can see, the ? character makes the search ungreedy, thus matching as little as possible.
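The same difference shows up when you ask for every match with the g flag. This is only a sketch reusing the string from above; the variable names are mine:

```javascript
const text = 'aaaXaaaXaaa';

// Lazy: each match ends at the nearest X, so two separate matches are found.
const lazyAll = text.match(/.+?X/g);

// Greedy: the first match runs through to the last X, leaving nothing for a second one.
const greedyAll = text.match(/.+X/g);

console.log(lazyAll);   // [ 'aaaX', 'aaaX' ]
console.log(greedyAll); // [ 'aaaXaaaX' ]
```

The trailing "aaa" is never matched in either case because no X follows it.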
