Lazy regex with OR
I have strings of the form:
greengrocerabc
sandwichcba
oscardcba
I'd like to lazily match abc, cba and dcba, so I can get the first component. So something like
sub("^(.+)(abc|cba|dcba)", "\\1", "oscardcba") => "oscar"
However, the regexp OR matches greedily: it matches cba and I get oscard.
How can I lazily match this OR? The language is R, but it can act like grep or perl.
Then make the quantifier non-greedy:
^(.+?)(abc|cba|dcba)
This way, the capture group will only contain the shortest possible match (which will not include abc, cba or dcba).
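To see the difference on the three strings from the question, here is a minimal sketch comparing the greedy and non-greedy patterns. The variable names (x, greedy, lazy) are only for illustration, and perl = TRUE is an assumption made here so that both calls use Perl-style matching; the question's own call did not set it.

```r
# The three example strings from the question
x <- c("greengrocerabc", "sandwichcba", "oscardcba")

# Greedy: .+ grabs as much as it can, so for "oscardcba" only "cba"
# is left for the alternation and the "d" stays in the capture group
greedy <- sub("^(.+)(abc|cba|dcba)", "\\1", x, perl = TRUE)

# Non-greedy: .+? grows one character at a time and stops at the first
# position where one of the alternatives matches
lazy <- sub("^(.+?)(abc|cba|dcba)", "\\1", x, perl = TRUE)

print(greedy)  # "greengrocer" "sandwich" "oscard"
print(lazy)    # "greengrocer" "sandwich" "oscar"
```

If the suffix must sit at the very end of the string, appending $ to the pattern makes that explicit.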
Further reading:
- http://www.regular-expressions.info/