How to strip tags in a safer way than using strip_tags function?

I'm having some problems using the PHP strip_tags function when the string contains 'less than' and 'greater than' signs. For example:

If I do:

strip_tags("<span>some text <5ml and then >10ml some text </span>");

I'll get:

some text 10ml some text

But, obviously I want to get:

some text <5ml and then >10ml some text

Yes, I know that I could use &lt; and &gt;, but I have no chance to convert those characters into HTML entities, since the data is already stored as you can see in my example.

What I'm looking for is a clever way to parse the HTML in order to get rid of only the actual HTML tags.

Since TinyMCE was used to generate that data, I know which actual HTML tags can appear in any case, so a strip_tags($string, $black_list) implementation would be more useful than strip_tags($string, $allowable_tags).

Any thoughts?


As a wacky workaround you could filter non-html brackets with:

$html = preg_replace("# <(?![/a-z]) | (?<=\s)>(?![a-z]) #exi", "htmlentities('$0')", $html);

Apply strip_tags() afterwards. Note that this only works for your specific example and similar cases. It's a regular expression with some heuristics, not artificial intelligence that can discern HTML tags from unescaped angle brackets with other meanings.


If you want to have "greater than" and "less than" signs, you need to escape them:

&gt; is >

&lt; is <

See e.g. this: http://www.w3schools.com/html/html_entities.asp


Instead of strip_tags(), just use htmlspecialchars().

http://php.net/manual/en/function.htmlspecialchars.php
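
To make the trade-off concrete, here is a minimal sketch of what htmlspecialchars() does with the string from the question. Note that it escapes every angle bracket rather than removing anything, so the span tags stay in the result as visible text; whether that is acceptable depends on how the output is used. The variable names are only for illustration.

```php
<?php
// The sample string from the question.
$raw = '<span>some text <5ml and then >10ml some text </span>';

// htmlspecialchars() converts <, >, & and " to entities; it strips nothing.
$escaped = htmlspecialchars($raw);

echo $escaped;
// prints: &lt;span&gt;some text &lt;5ml and then &gt;10ml some text &lt;/span&gt;
```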


Following up on the accepted answer that uses a heuristic to remove tags while sparing < and > signs, here is a version that uses preg_replace_callback, since the /e modifier in preg_replace was deprecated in PHP 5.5 and removed in PHP 7.0:

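function HTMLToString($string)
{
    // Escape angle brackets that do not look like part of a tag.
    $escaped = preg_replace_callback(
        "# <(?![/a-z]) | (?<=\s)>(?![a-z]) #xi",
        function ($matches) {
            return htmlentities($matches[0]);
        },
        $string
    );
    // Strip the real tags, then turn the escaped brackets back into < and >.
    return htmlspecialchars_decode(strip_tags($escaped));
}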
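

Since the question mentions that the set of tags TinyMCE can produce is known, a blacklist approach might also be worth trying. The following is only a sketch under that assumption (PHP 7 or later): strip_listed_tags and the tag list passed to it are hypothetical names, not part of PHP. It removes opening and closing tags whose name is in the list and leaves every other angle bracket alone.

```php
<?php
// Hypothetical helper: removes only the tags named in $blacklist.
function strip_listed_tags(string $html, array $blacklist): string
{
    if ($blacklist === []) {
        return $html;
    }
    // Quote each tag name so it is safe to embed in the pattern.
    $names = implode('|', array_map(function ($tag) {
        return preg_quote($tag, '#');
    }, $blacklist));
    // Match an opening or closing tag whose name is in the list.
    // \b stops "b" from matching "<br>"; [^>]* consumes any attributes.
    return preg_replace('#</?(?:' . $names . ')\b[^>]*>#i', '', $html);
}

// The tag list here is an assumption; use whatever your editor can emit.
$tags = ['span', 'p', 'strong', 'em', 'br'];
echo strip_listed_tags('<span>some text <5ml and then >10ml some text </span>', $tags);
// prints: some text <5ml and then >10ml some text
```

This is still a heuristic: an attribute value that contains > will cut the match short, and plain text such as "<p and q>" would be removed if p is in the list.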