
strip_tags: strip off the messy tags and styles

How can I strip off certain html tags and allow some of them?

For instance,

I want to strip off span tags but allow the span with underline.

<span style="text-decoration: underline;">Text</span>

I want to allow p but I want to remove any styles or classes inside the p for instance,

<p class="99light">Text</p> the class inside the p tag should be removed - I just want a clean p tag.

This is the line I have so far:

strip_tags($content, '<p><a><br><em><strong><ul><li>');


You can't. You'll need to use an XML/HTML parser to do that:

// with DOMDocument it might look something like this.
$dom = new DOMDocument();
$dom->loadHTML( $content );
foreach( $dom->getElementsByTagName( "p" ) as $p )
{
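    // To remove every attribute from the p tag, drop the first one until
    // none are left (the attribute map shrinks as attributes are removed):
    /*
    while( $p->attributes->length > 0 )
    {
        $p->removeAttributeNode( $p->attributes->item( 0 ) );
    }
    */
    // Remove only the style attribute. removeAttribute() does nothing when
    // the attribute is absent, whereas getAttributeNode() would return false.
    $p->removeAttribute( "style" );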
}
echo $dom->saveHTML();
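
To cover both requirements from the question (clean p tags, and spans kept only when they underline), the same DOMDocument approach could be extended roughly as follows. This is only a sketch: the function name cleanHtml(), the wrapper div, and the regular expression used to detect the underline style are assumptions of mine, not part of the original answer, and character-encoding handling is left out.

```php
<?php
// Hypothetical helper; the name and the underline rule are assumptions.
function cleanHtml(string $content): string
{
    // First pass: drop every tag outside the whitelist (span is kept for now).
    $content = strip_tags($content, '<p><a><br><em><strong><ul><li><span>');

    $previous = libxml_use_internal_errors(true); // collect parser warnings quietly
    $dom = new DOMDocument();
    // div is not whitelisted above, so this wrapper is the only div in the tree.
    $dom->loadHTML('<div>' . $content . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Snapshot the live node list, because the loop modifies the tree.
    foreach (iterator_to_array($dom->getElementsByTagName('span')) as $span) {
        if (preg_match('/text-decoration\s*:\s*underline/i', $span->getAttribute('style'))) {
            // Keep the span, reduced to the underline style only.
            while ($span->attributes->length > 0) {
                $span->removeAttributeNode($span->attributes->item(0));
            }
            $span->setAttribute('style', 'text-decoration: underline;');
            continue;
        }
        // Unwrap any other span: move its children up, then remove it.
        while ($span->firstChild) {
            $span->parentNode->insertBefore($span->firstChild, $span);
        }
        $span->parentNode->removeChild($span);
    }

    // Strip every attribute (class, style, ...) from each p tag.
    foreach ($dom->getElementsByTagName('p') as $p) {
        while ($p->attributes->length > 0) {
            $p->removeAttributeNode($p->attributes->item(0));
        }
    }

    // Serialize only the wrapper's children, not the html/body shell.
    $html = '';
    foreach ($dom->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }
    return $html;
}
```

Note that this leaves attributes on the other allowed tags (for example on a) untouched, so it tidies markup but should not be treated as a security filter for untrusted input.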


You need full DOM parsing. strip_tags offers neither the security nor the customization you need here. I have used the HTMLPurifier library for this in the past. It actually parses the markup, lets you set whitelists, takes care of malicious input, and produces valid markup.

By "security" I mean that if you try to write a custom parser you will make a mistake (don't worry, I would too). By "customization" I mean that no built-in function lets you target only certain tags with certain attributes and certain values of those attributes. HTMLPurifier is the PHP library for that.
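
For the case in the question, an HTMLPurifier setup might look roughly like the sketch below. It assumes the library was installed through Composer (package ezyang/htmlpurifier), so the autoload path is a guess about your project layout; the a[href] entry and the sample $content are also my assumptions. Note that it allows any text-decoration value, not just underline.

```php
<?php
// Assumed Composer install; adjust the path to your project.
require_once __DIR__ . '/vendor/autoload.php';

// Sample input, for illustration only.
$content = '<p class="99light">Text <span style="text-decoration: underline;">u</span></p>';

$config = HTMLPurifier_Config::createDefault();
// Element/attribute whitelist: p gets no attributes, span may keep style.
$config->set('HTML.Allowed', 'p,a[href],br,em,strong,ul,li,span[style]');
// Inside style attributes, allow only the text-decoration property.
$config->set('CSS.AllowedProperties', 'text-decoration');
// Drop spans that are left without attributes after filtering.
$config->set('AutoFormat.RemoveSpansWithoutAttributes', true);

$purifier = new HTMLPurifier($config);
echo $purifier->purify($content);
```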
