
Stripping script tags from HTML input

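using System.Text.RegularExpressions;
public static class StringExtensions { // class name is a placeholder
    public static string MakeWebSafe(this string x) {
        // Removes opening and closing script tags only; the script body stays in the text.
        const string RegexRemove = @"(<\s*script[^>]*>)|(<\s*/\s*script[^>]*>)";
        return Regex.Replace(x, RegexRemove, string.Empty, RegexOptions.IgnoreCase);
    }
}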

Is there any reason this implementation isn't good enough? Can you break it? Is there anything I haven't considered? If you use or have used something different, what are its advantages?

I'm aware this leaves the body of the script in the text, but that's okay for this project.

UPDATE

Don't do the above! I went with this in the end: HTML Agility Pack strip tags NOT IN whitelist.


Have you considered this kind of scenario?

<scri<script>pt type="text/javascript">
    causehavoc();
</scr</script>ipt>
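
To make the problem concrete, here is a minimal sketch that runs the question's pattern once over that input. The class name BypassDemo and the method name StripOnce are made up for illustration; the pattern and options are copied from the question.

```csharp
using System.Text.RegularExpressions;

public static class BypassDemo
{
    // Same pattern and options as MakeWebSafe in the question.
    private const string RegexRemove = @"(<\s*script[^>]*>)|(<\s*/\s*script[^>]*>)";

    // A single replacement pass, exactly what MakeWebSafe does.
    public static string StripOnce(string input)
    {
        return Regex.Replace(input, RegexRemove, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

Calling StripOnce on the nested input removes only the inner `<script>` and `</script>`, and the surrounding fragments join back up into `<script type="text/javascript">` and `</script>`. The output of the "sanitizer" is a working script tag.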

The best thing to do is remove all tags, HTML-encode the input before displaying it, or use BBCode.
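
As a rough sketch of the "encode things" option, assuming the text is going into an HTML body and no markup needs to survive: WebUtility.HtmlEncode from System.Net turns the angle brackets into entities, so nothing the user types can be parsed as a tag. The class name EncodingSketch and the method name EncodeForHtml are hypothetical.

```csharp
using System.Net;

public static class EncodingSketch
{
    // Encodes <, >, &, and quotes as HTML entities instead of trying to strip tags.
    // Treating null as an empty string is an assumption made for this sketch.
    public static string EncodeForHtml(string input)
    {
        return input == null ? string.Empty : WebUtility.HtmlEncode(input);
    }
}
```

With this, `<script>causehavoc();</script>` comes out as `&lt;script&gt;causehavoc();&lt;/script&gt;`, which the browser shows as text. The nested-tag trick above gains nothing, because every `<` is encoded no matter where it appears.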


Yes, your regex can be circumvented by Unicode-encoding the script tags. I would suggest you look at more robust libraries when it comes to security. Take a look at the Microsoft Web Protection Library.
