Regular expression issue with lookbehind and lookahead

I'm trying to write a regex that captures all the content from <div class="entrytext"> up to the first </p> that follows this div.

At the moment this is what I have:

(?<=<div class="entrytext">.*<p>).*(?></p>)

It works in that none of the code above this div is matched, but the problem is that there are many </p> tags after this <div> in the document.

What I would like is to take all the content after this div, but only up to the first </p> found.

Could you give me a hand? Thanks in advance.


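  1. Most regex engines don't allow variable-length lookbehinds, so a .* inside (?<=...) is rejected by them.
  2. You need non-greedy quantifiers (a ? after each *); otherwise .* runs on to the last </p> in the document. Also note that (?>...) is an atomic group, not a lookahead; a lookahead is written (?=...):
    (?<=<div class="entrytext">.*?<p>).*?(?=</p>)
  3. Regex is (surprisingly, for once) the right tool for this job, but still look into HTML parsers; whatever you are doing that needs this would probably benefit from one.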
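
As a minimal sketch of the first two points, assuming Python's built-in re module (the question does not say which engine is in use): re rejects the variable-length lookbehind, so the example matches the prefix normally and pulls out the wanted text with a capture group. The sample HTML and the helper names first_paragraph and lookbehind_is_rejected are made up for illustration.

```python
import re

# Hypothetical sample document; the real page layout is an assumption.
html = """<p>ignored</p>
<div class="entrytext">
<p>First paragraph.</p>
<p>Second paragraph.</p>
</div>"""

# Match the prefix normally instead of using a lookbehind, and capture
# only the text between <p> and the first </p>. The lazy .*? stops at the
# earliest possible match; DOTALL lets . cross line breaks.
PATTERN = re.compile(r'<div class="entrytext">.*?<p>(.*?)</p>', re.DOTALL)


def first_paragraph(text):
    """Return the text of the first <p> after the entrytext div, or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


def lookbehind_is_rejected():
    """Return True if this engine refuses the variable-length lookbehind."""
    try:
        re.compile(r'(?<=<div class="entrytext">.*?<p>).*?(?=</p>)')
    except re.error:
        return True
    return False


print(first_paragraph(html))
print(lookbehind_is_rejected())
```

In an engine that does accept variable-length lookbehinds, the pattern from point 2 could be used directly; the capture-group form has the advantage of being accepted by far more engines.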