issue with preg_replace

This is a simple one :)

I have this line which works great:

$listing['biz_description'] = preg_replace('/<!--.*?--\>/','',$listing['biz_description']);

What is the proper regex to remove the HTML entity version?

These are the entities:

&lt;!-- --&gt;


If you are happy with the preg_replace regex you already have, I would just decode the HTML entities first with html_entity_decode. As @ircmaxell mentioned, using regex for HTML parsing can be very painful.

$str = "This is a <!-- test --> of the emergency &lt;!-- broadcast --&gt; system";
$str = preg_replace('/<!--.*?--\>/', '', html_entity_decode($str));
echo $str;
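
Note that html_entity_decode decodes every entity in the string, not just the comment markers. If that is not wanted, one possible alternative is a single pattern that accepts both the literal and the entity-encoded delimiters. The following is only a sketch: the helper name stripComments is hypothetical, and it assumes comments are never nested. It will also match a comment that opens in one form and closes in the other.

```php
<?php
// Hypothetical helper (not from the original answer): strips HTML comments
// in both literal and entity-encoded form without decoding other entities.
function stripComments(string $html): string
{
    // (?:<|&lt;) matches either opening form, (?:>|&gt;) either closing form.
    // The s modifier lets . match newlines, so multi-line comments are removed too.
    return preg_replace('/(?:<|&lt;)!--.*?--(?:>|&gt;)/s', '', $html);
}

$str = "This is a <!-- test --> of the emergency &lt;!-- broadcast --&gt; system &amp; more";
echo stripComments($str);
```

Here the &amp; at the end of the string is left untouched, whereas the html_entity_decode approach would turn it into a plain &.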


NEVER use regex to parse HTML/XML...

An implementation with DOMDocument (assuming valid XML):

$dom = new DomDocument();
$dom->loadXml($listing['biz_description']);
removeComments($dom);
$listing['biz_description'] = $dom->saveXml();

function removeComments(DomNode $node) {
    if ($node instanceof DomComment) {
        $node->parentNode->removeChild($node);
    } elseif ($node->hasChildNodes()) {
        // Copy the live child list first so removals do not skip siblings
        foreach (iterator_to_array($node->childNodes) as $child) {
            removeComments($child);
        }
    }
}
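
The same idea can probably be written more compactly with an XPath query that selects all comment nodes at once. This is a sketch only: the function name stripXmlComments is hypothetical, and it assumes the input is well-formed XML with a single root element, which an arbitrary HTML fragment may not be.

```php
<?php
// Hypothetical helper (not from the original answer): removes every comment
// node from a well-formed XML string using an XPath query.
function stripXmlComments(string $xml): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $xpath = new DOMXPath($dom);
    // Copy the query result into an array before mutating the tree.
    foreach (iterator_to_array($xpath->query('//comment()')) as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    // Serialize only the root element so the XML declaration is omitted.
    return $dom->saveXML($dom->documentElement);
}

echo stripXmlComments('<p>Hello <!-- hidden --> world<!-- again --></p>');
```

Passing the root element to saveXML keeps the XML declaration out of the returned string, which is usually what you want when the result is stored back as a description field.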