Purpose of (.+?) in regular expressions
I am kinda new to this Regex thing.
When analyzing some code I frequently come across the pattern .+? or (.+?)
I can't seem to find the meaning of this pattern using my noobish deductive reasoning.
. means any character (except a newline).
+ means one or more.
? in this context means lazy or non-greedy. That means it will try to match as few characters as possible while still satisfying the quantifier and letting the rest of the pattern match. Example:
> 'abc'.match(/.+/)
["abc"]
> 'abc'.match(/.+?/)
["a"]
> 'abc'.match(/.*/)
["abc"]
> 'abc'.match(/.*?/)
[""]
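To connect this to the (.+?) form from the question: the parentheses are a capturing group, so whatever the lazy .+? matched can be read back separately. A minimal sketch, assuming the same 'abc' input as above; the variable names alone and anchored are just for illustration. It also shows that "minimum" means the minimum that still lets the rest of the pattern match:

```javascript
// On its own, the lazy quantifier is satisfied by a single character.
const alone = 'abc'.match(/.+?/);

// Followed by a literal "c", it has to keep expanding until "c" can match.
// The parentheses capture only the part matched by .+?
const anchored = 'abc'.match(/(.+?)c/);

console.log(alone[0]);    // a
console.log(anchored[0]); // abc
console.log(anchored[1]); // ab
```

So .+? is rarely useful by itself; it usually sits in front of something that tells it where to stop.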
It depends on what kind of knowledge you have about patterns. Here's an explanation that assumes you have some kind of basic idea about what regular expressions are:
. matches any character
+ means repeat the last pattern 1 or more times
So far, .+ means one or more characters.
? means ungreedy, which means the matching will stop at the first opportunity, i.e. as soon as the rest of the pattern can match.
A quick explanation on greediness:
/.+X/.exec("aaaXaaaXaaa");
["aaaXaaaX"]
/.+?X/.exec("aaaXaaaXaaa");
["aaaX"]
As you can see, the ? character makes the search ungreedy, thus matching as little as possible.
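This is probably why (.+?) shows up so often in real code: it is a handy way to capture each delimited piece of a string one at a time. A rough sketch, assuming a made-up input string with two quoted words (the names text, lazyAll and greedyAll are hypothetical) and an environment that supports String.prototype.matchAll:

```javascript
// Hypothetical input, chosen only to illustrate the difference.
const text = 'say "hello" then "bye"';

// Lazy group: each match stops at the nearest closing quote.
const lazyAll = [...text.matchAll(/"(.+?)"/g)].map(m => m[1]);

// Greedy group: the single match runs to the last closing quote.
const greedyAll = [...text.matchAll(/"(.+)"/g)].map(m => m[1]);

console.log(lazyAll);   // [ 'hello', 'bye' ]
console.log(greedyAll); // [ 'hello" then "bye' ]
```

The greedy version swallows everything between the first and the last quote, while the lazy version gives you the two pieces separately.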