
PHP regular expression to remove all javascript with exception

I'm looking for a way to remove all JavaScript script tags from an HTML string.

The following regex works fine, but I would like to add an exception:

$html = preg_replace('#<script[^>]*>.*?</script>#is', '', $html);

How can I add a rule so that scripts of type text/html are ignored?

<script type="text/html" ... > ... </script> 

Any suggestions?

Thanks in advance.


You may not be trying to sanitize untrusted HTML, but just so readers of this question don't get the wrong idea:

This won't remove JavaScript outside <script> elements: <img src=bogus onerror=alert(42)>.

It won't remove barely obfuscated scripts (note the space before the final >): <script>alert(42)</script >.

It will turn invalid content into scripts: <scrip<script></script>t>alert(42)</script>.

I'm not saying this is what you're trying to do. You may have perfectly good reasons for doing this that don't have to do with untrusted inputs, but, for later readers, don't try to roll your own HTML sanitizer with just regular expressions.
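
The three cases listed above can be checked directly against the regex from the question. This is only a small sketch to make the failure modes visible; the $cases and $results names and the labels are illustrative and not part of the original post.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '#<script[^>]*>.*?</script>#is';

// The three inputs mentioned in this answer (labels are illustrative).
$cases = [
    'event handler'        => '<img src=bogus onerror=alert(42)>',
    'space in closing tag' => '<script>alert(42)</script >',
    'nested fragment'      => '<scrip<script></script>t>alert(42)</script>',
];

$results = [];
foreach ($cases as $label => $input) {
    // Apply the question's replacement and keep the output for inspection.
    $results[$label] = preg_replace($pattern, '', $input);
    echo $label, ': ', $results[$label], PHP_EOL;
}
```

The first two inputs come back unchanged, and the third comes back as a complete <script>alert(42)</script> element that was not present as a well-formed tag in the input.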


Use a greedy match that won't fall for the cases Mike points out, like so:

$html = preg_replace('#<script.*</script>#is', '', $html);

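This greedily matches everything from the first <script to the last </script>, so it removes all script tags, but it also removes any markup that sits between separate script blocks. As for the exception, I'm not sure how to do that, sorry.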
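
For the exception itself, one possible approach is a negative lookahead right after <script that refuses to match when the opening tag carries type="text/html". The sketch below keeps the lazy match from the question; the function name strip_scripts_except_templates is hypothetical, and the pattern assumes the type attribute sits inside the opening tag, preceded by whitespace, with or without quotes. It carries all the limitations of the regex approach described in the other answer, so it is not suitable for untrusted input.

```php
<?php
// Hypothetical helper, not from the original post: removes <script> blocks
// unless the opening tag has type="text/html" (quoted or unquoted).
function strip_scripts_except_templates(string $html): string
{
    // (?! ... ) is a negative lookahead: the match is abandoned if
    // type=text/html appears somewhere before the closing > of the opening tag.
    $pattern = '#<script\b(?![^>]*\stype\s*=\s*["\']?text/html\b)[^>]*>.*?</script\s*>#is';

    // preg_replace returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '', $html) ?? $html;
}

$html = '<p>a</p>'
      . '<script>alert(1)</script>'
      . '<script type="text/html"><b>tpl</b></script>'
      . '<script type="text/javascript">x()</script>'
      . '<p>b</p>';

echo strip_scripts_except_templates($html), PHP_EOL;
```

With this input, the two JavaScript blocks are removed and the text/html block is kept. The closing-tag part uses </script\s*> so that a space before the final > no longer prevents a match.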
