Ruby Regexp: + vs *. special behaviour?
Using Ruby regexps, I get the following results:
>> 'foobar'[/o+/]
=> "oo"
>> 'foobar'[/o*/]
=> ""
But:
>> 'foobar'[/fo+/]
=> "foo"
>> 'foobar'[/fo*/]
=> "foo"
The documentation says:
*: zero or more repetitions of the preceding
+: one or more repetitions of the preceding
So I expect that 'foobar'[/o*/] returns the same result as 'foobar'[/o+/].
Does anybody have an explanation for that?
'foobar'[/o*/] is matching the zero o's that appear before the f, at position 0.
'foobar'[/o+/] can't match there because there needs to be at least one o, so it instead matches all the o's from position 1.
Specifically, the matches you are seeing are:
'foobar'[/o*/] => '<>foobar'
'foobar'[/o+/] => 'f<oo>bar'
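To see those positions directly, one option is to ask the MatchData object where each match begins and ends. This is only a small sketch; the variable names (str, star, plus, star_span, plus_span) are made up for illustration.

```ruby
str = 'foobar'

# String#match returns a MatchData object; begin(0) and end(0) give the
# start and end offsets of the whole match.
star = str.match(/o*/)
plus = str.match(/o+/)

star_span = [star.begin(0), star.end(0)]
plus_span = [plus.begin(0), plus.end(0)]

puts "o* matched #{star[0].inspect} at #{star_span.inspect}"  # "" at [0, 0]
puts "o+ matched #{plus[0].inspect} at #{plus_span.inspect}"  # "oo" at [1, 3]
```

The empty span [0, 0] is the '<>' before the f; the span [1, 3] covers the two o's.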
This is a common misunderstanding of how regexp works.
Although the pattern isn't anchored at the start of the string, the regexp engine still starts looking from the beginning of the string. In the case of /o+/, it does not match at position 0 (the "f"), but since + means one or more, the engine has to keep trying later positions (this has nothing to do with greediness) until a match is found or all positions have been tried.
With /o*/, however, which as you know means zero or more times, the attempt at position 0 succeeds right away: there is no o at that position, but zero o's is a valid match, so the engine returns the empty string and stops there (as it should, because o* makes the o optional). The engine always returns the leftmost match, so it never goes on to look for a longer one further along the string.
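The position-by-position search can be imitated in a few lines of Ruby. This is a rough sketch of the idea, not how the real engine is implemented: first_match_position is a hypothetical helper, and it relies on \G, which in Regexp#match with an offset only matches at the position where the search begins.

```ruby
# Hypothetical helper: try the pattern at each start position in turn and
# return the first position where it matches, together with the matched text.
def first_match_position(str, pattern)
  # \G pins the match to the offset passed to Regexp#match.
  anchored = /\G#{pattern}/
  (0..str.length).each do |pos|
    m = anchored.match(str, pos)
    return [pos, m[0]] if m
  end
  nil
end

p first_match_position('foobar', /o*/)   # [0, ""]
p first_match_position('foobar', /o+/)   # [1, "oo"]
p first_match_position('foobar', /fo*/)  # [0, "foo"]
```

With /o*/ the very first attempt succeeds with an empty match, so the loop never reaches the o's. With /fo*/ the f matches at position 0 and the greedy o* then takes both o's.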