preg_replace div (or anything) with class=removeMe

I'm trying to remove some elements with preg_replace but can't get it to work consistently. I would like to remove an element with a matching class. The problem is that the element may also have an ID or several classes.

For example, the element could be

<div id="me1" class="removeMe">remove me and my parent</div> 

or

<div id="me1" class="removeMe" style="display:none">remove me and my parent</div>

Is it possible to do this?

Any help appreciated! Dan.


I agree with MarcB. Overall, it's better to use a DOM when manipulating HTML. But here is a regex based on smottt's answer that might work:

$html = preg_replace('~<div([^>]*)(class\\s*=\\s*["\']removeMe["\'])([^>]*)>(.*?)</div>~i', '', $html);
  • Use [^>]* and [^<]* instead of .*. In my testing, .*? doesn't work. If a non-matching div comes before a matching div, it will match the first div, everything in between, and the last div. For example, it incorrectly matches against this entire string: <div></div><b>hello</b><div class="removeMe">bar</div>
  • Take into account the fact that you can use single quotes with HTML attributes.
  • Also remember that there can be whitespace around the equals sign.
  • You should use the "m" modifier too so that it takes line breaks into account (see this page).

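I added parentheses for clarity, but they aren't needed. Let me know if this works or not.

EDIT: Actually, never mind, the "m" modifier won't do anything here: it only changes how ^ and $ match, and the pattern uses neither.

EDIT2: Improved the regex, but it still fails if there are any newlines in the div, because . does not match a newline unless the "s" modifier is set.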
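
For reference, the DOM route mentioned at the top of this answer could look roughly like this with PHP's bundled DOMDocument and DOMXPath. Treat it as a minimal sketch under a few assumptions: removeByClass is a hypothetical helper name, the extra wrapper <div> is only there so that a fragment parses under LIBXML_HTML_NOIMPLIED, and $class is assumed to be a plain class name with no quotes in it.

```php
<?php
// Hypothetical helper: removes every element whose class list contains $class.
// Assumes $class is a simple class name (no quotes), since it is pasted into XPath.
function removeByClass(string $html, string $class): string
{
    $doc = new DOMDocument();

    // Collect parser warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so there is a single root element to read back from.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match the class as a whole token, so "foo removeMe" matches
    // but "removeMeToo" does not.
    $xpath = new DOMXPath($doc);
    $query = sprintf(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    // Copy the result into an array before detaching nodes.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize the wrapper's children, leaving the wrapper itself out.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Called as removeByClass($html, 'removeMe'), this removes the matching elements together with their contents, including nested divs and line breaks, which is where the regex struggles. Non-ASCII input may need an explicit charset declaration, since loadHTML does not assume UTF-8.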


While this is still doable with regular expressions, it's much simpler with e.g. QueryPath:

print qp($html)->find(".removeMe")->parent()->remove()->writeHTML();


With preg_replace:

preg_replace('~<div([^>]*)class="(.*?)gallery(.*?)">(.*?)</div>~im', '', $html);
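
Adapted to the question's removeMe class, a variant of this pattern might look like the sketch below. It is an illustration rather than a robust solution: stripRemoveMeDivs is a hypothetical name, the "s" modifier is added so that . also matches line breaks, and the lookarounds are there so that removeMe is matched as a whole class name among several. It still stops at the first </div>, so a div nested inside the matching one will break it.

```php
<?php
// Hypothetical helper: strips non-nested <div> elements whose class
// attribute contains removeMe as a whole class name.
function stripRemoveMeDivs(string $html): string
{
    $pattern = '~<div\b[^>]*\sclass\s*=\s*(["\'])'  // opening tag up to the class value
             . '[^"\']*(?<![\w-])removeMe(?![\w-])' // removeMe as a whole token
             . '[^"\']*\1[^>]*>'                    // rest of the value and of the tag
             . '.*?</div>~is';                      // content up to the first </div>

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}
```

With this, class="removeMe", class='removeMe' and class="foo removeMe" are all removed, with or without an id or style attribute, while class="removeMeToo" is left in place.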