regex optional lookahead

I want a regular expression to match all of these:

  1. startabcend
  2. startdef
  3. blahstartghiend
  4. blahstartjklendsomething

and to return abc, def, ghi and jkl respectively.

I have the following, which works for cases 1 and 3, but I am having trouble making the lookahead optional.

(?<=start).*(?=end.*)

Edit:

Hmm. Bad example. In reality, the bit in the middle is not numeric, but it is preceded by a certain set of characters and optionally followed by another. I have updated the inputs and outputs as requested and added a 4th example in response to someone's question.


If you're able to use lookahead,

(?<=start).*?(?=(?:end|$))

as suggested by stema below is probably the simplest way to get the entire pattern to match what you want.
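
As a quick check, here is a minimal Python sketch that runs this pattern against the four inputs from the question. Python's `re` module is only an assumption here (the question does not name a regex flavor), and `extract` is a hypothetical helper name.

```python
import re

# Lookbehind for "start", lazy match, then stop before "end" or at end of string.
# (?=end|$) is equivalent to (?=(?:end|$)); the inner group is not required.
LOOKAROUND = re.compile(r"(?<=start).*?(?=end|$)")

samples = ["startabcend", "startdef", "blahstartghiend", "blahstartjklendsomething"]

def extract(text):
    """Return the text after 'start', up to the first 'end' or the end of the string."""
    m = LOOKAROUND.search(text)
    return m.group(0) if m else None

lookaround_results = [extract(s) for s in samples]
print(lookaround_results)  # ['abc', 'def', 'ghi', 'jkl']
```

The whole match is the wanted text, so no capture group is needed.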

Alternatively, if you're able to use capturing groups, you should just do that instead:

start(.*?)(?:end|$)

and then just get the value from the first capture group.
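
A minimal Python sketch of the capture-group approach follows. It uses the form `start(.*?)(?:end|$)`, which stops at the first "end" even when more text follows it (case 4); a trailing `$` after an optional `end` group would instead force the capture to run to the end of the string. Python's `re` and the helper name `capture` are assumptions for illustration.

```python
import re

# "start", a lazy capture, then either "end" or the end of the string.
CAPTURE = re.compile(r"start(.*?)(?:end|$)")

samples = ["startabcend", "startdef", "blahstartghiend", "blahstartjklendsomething"]

def capture(text):
    """Return capture group 1, or None when 'start' is not present."""
    m = CAPTURE.search(text)
    return m.group(1) if m else None

capture_results = [capture(s) for s in samples]
print(capture_results)  # ['abc', 'def', 'ghi', 'jkl']
```

This variant needs no lookbehind, so it also works in flavors that lack one.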


Maybe like this:

(?<=start).*?(?=(?:end|$))

This will match the text between "start" and "end", or from "start" to the end of the line if there is no "end". Additionally, the quantifier has to be non-greedy (.*?) so that the match stops at the first "end".

See it here on Regexr

I extended the example on Regexr so that it no longer works only with digits.


An optional lookahead doesn't make sense:

If it's optional then it's ok if it matches, but it's also ok if it doesn't match. And since a lookahead does not extend the match it has absolutely no effect.

So the syntax for an optional lookahead is the empty string.
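
To illustrate the point, here is a small Python sketch. It spells the "optional lookahead" as an alternation between the lookahead and the empty string, `(?:(?=end)|)`, which is my own way of writing it rather than anything from the question, and compares it with the same pattern without the lookahead.

```python
import re

# An "optional" lookahead: either the lookahead succeeds, or the empty branch does.
with_optional = re.compile(r"(?<=start).*(?:(?=end)|)")
# The same pattern with the optional lookahead removed entirely.
without = re.compile(r"(?<=start).*")

samples = ["startabcend", "startdef", "blahstartghiend", "blahstartjklendsomething"]

pairs = [
    (with_optional.search(s).group(0), without.search(s).group(0))
    for s in samples
]
for a, b in pairs:
    print(repr(a), repr(b))  # both columns are identical for every input
```

Both patterns return `abcend` for the first input: the greedy `.*` runs to the end of the string and the optional lookahead never forces it to back off.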


Lookahead alone won't do the job. Try this:

(?<=start)(?:(?!end).)*

The lookbehind positions you after the word "start", then the rest of it consumes everything until (but not including) the next occurrence of "end".
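
Here is a rough Python sketch of that pattern applied to the question's inputs; Python's `re` is an assumption, and `until_end` is a hypothetical helper name.

```python
import re

# After "start", consume one character at a time as long as "end" does not begin there.
TEMPERED = re.compile(r"(?<=start)(?:(?!end).)*")

samples = ["startabcend", "startdef", "blahstartghiend", "blahstartjklendsomething"]

def until_end(text):
    """Return everything after 'start' up to, but not including, the next 'end'."""
    m = TEMPERED.search(text)
    return m.group(0) if m else None

tempered_results = [until_end(s) for s in samples]
print(tempered_results)  # ['abc', 'def', 'ghi', 'jkl']
```

When there is no "end" at all, the loop simply runs to the end of the line, which covers case 2 without any alternation.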

Here's a demo on Ideone.com


If "end" is always going to be present, then use (?<=start)(.*?)(?=end), much as you have in the question. Since you say "make the lookahead optional", just run up until there's either "end" or a newline: (?<=start)(.*?)(?=end|\n). If you don't care about capturing the "end" group, you can skip the lookahead and do (?:start)?(.*?)(?:end)?, which is meant to start after "start", if it's there, and stop before "end", if it's there. Be aware that with a lazy quantifier and nothing required after it, (.*?) is free to match the empty string, so that form needs something definite to stop at. You can also use more of those piped "or" patterns: (?:start|^) and (?:end|\n).
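
One detail worth checking with the (?=end|\n) form is input that has no trailing newline. The Python sketch below compares it with a `$` variant, which is my substitution and not part of the answer above; the variable names are hypothetical.

```python
import re

# Stops before "end" or before a literal newline character.
newline_version = re.compile(r"(?<=start)(.*?)(?=end|\n)")
# Stops before "end" or at the end of the string (assumed alternative).
dollar_version = re.compile(r"(?<=start)(.*?)(?=end|$)")

no_newline = newline_version.search("startdef")        # no "\n" in the input
with_newline = newline_version.search("startdef\n")    # input ends with "\n"
dollar_match = dollar_version.search("startdef")

print(no_newline)             # None
print(with_newline.group(1))  # def
print(dollar_match.group(1))  # def
```

If the text comes from a file read line by line, the newline is usually present and either form works; for a bare string, `$` is the safer terminator.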


Why do you need lookahead?

start(\d+)\w*

See it on rubular
