
How to protect an ASP.NET web app from XSS while preserving entered data?

My colleagues and I have been debating how to best protect ourselves from XSS attacks but still preserve HTML characters that get entered into fields in our software.

To me, the ideal solution is to turn off ASP.NET request validation, accept the data as the user enters it, and store it in the database exactly as entered. Then, whenever you display the data on the web, HTML-encode it. The problem with this approach is that there's a high likelihood that a developer somewhere, someday, will forget to HTML-encode a value when displaying it. Bam! XSS vulnerability.
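A minimal, language-neutral sketch of the "store raw, encode on output" idea. It uses Python's standard `html.escape` purely for illustration; in ASP.NET the corresponding call would be something like `HttpUtility.HtmlEncode`. The function name `render_comment` and the sample value are made up for this example.

```python
import html

def render_comment(stored_value: str) -> str:
    """Encode at display time; the stored value stays exactly as entered."""
    return "<p>" + html.escape(stored_value, quote=True) + "</p>"

# Hypothetical value as it would sit in the database, unmodified.
stored = '<script>alert("xss")</script> & <bob@example.com>'

page_fragment = render_comment(stored)
print(page_fragment)
```

The stored value is never altered, so nothing the user typed is lost. The safety depends entirely on every output path going through the encoding step, which is exactly the weakness described above.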

Another proposed solution was to turn request validation off and use a regex to strip out any HTML users enter before it is stored in the database. Devs would still have to HTML-encode things for display, but since the HTML tags have been stripped, we think it would be safe even if a dev forgets. The drawback is that users can't enter HTML tags into descriptions and other fields, even if they explicitly want to. They may also accidentally paste in an email address surrounded by < > and the regex doesn't handle it correctly. It screws with the data, and it's not ideal.
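To make the "it screws with the data" point concrete, here is a small sketch. The pattern `NAIVE_TAG_RE` is a hypothetical stand-in for whatever regex would be used (the actual one isn't shown here), and the sample strings are invented.

```python
import re

# Hypothetical naive tag-stripping pattern; not the regex from the question.
NAIVE_TAG_RE = re.compile(r"<[^>]*>")

def strip_tags(value: str) -> str:
    """Remove anything that looks like <...> from the input."""
    return NAIVE_TAG_RE.sub("", value)

# An email address wrapped in angle brackets is silently deleted.
email_note = strip_tags("Contact Bob <bob@example.com> today")

# Ordinary comparison operators are treated as one big "tag".
comparison = strip_tags("if a < b and c > d")

print(repr(email_note))
print(repr(comparison))
```

In both cases the user typed no HTML at all, yet part of what they entered is gone before it ever reaches the database.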

The other issue we have to keep in mind is that the system has been built without committing to any one strategy around this. At one point, some devs wrote pages that HTML-encode data before it is stored in the database. So some data in the database is already HTML-encoded and some is not. It's a mess. We can't really trust any data that comes from the database as safe for display in a browser.
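A short sketch of why a mix of encoded and unencoded rows is hard to deal with. It uses Python's `html.escape` only as an illustration, and the sample values are invented.

```python
import html

entered = "Tom & Jerry <3"

# Two hypothetical rows holding the same user input:
stored_raw = entered                   # written by a page that stores data as-is
stored_encoded = html.escape(entered)  # written by a page that encodes before storing

# Encoding on output is correct for the raw row...
shown_raw = html.escape(stored_raw)
# ...but double-encodes the row that was already encoded.
shown_encoded = html.escape(stored_encoded)

# The stored text alone can't tell you which case you're in: this value could be
# an encoded "<b>", or it could be literally what a user typed.
ambiguous = "&lt;b&gt;"

print(shown_raw)
print(shown_encoded)
```

Skipping the encoding step to avoid the double-encoded output would reopen the XSS hole for the raw rows, so neither choice is safe for every row.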

My question is: What would be the ideal solution if you were building an ASP.NET web app from the ground up, and what would be a good approach for us, given our situation?


Assuming you go ahead and store the HTML directly in the database: in ASP.NET MVC with Razor, HTML-encoding is done automatically, so your negligent developer would have to really go above and beyond the call of duty to introduce the XSS. With standard Web Forms (or the Web Forms view engine), you can require developers to use the <%: syntax, which accomplishes the same thing, albeit with more risk that a developer will be negligent.
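The "encoded unless you explicitly opt out" behaviour can be sketched in a few lines. This is not Razor, just a toy illustration of the principle in Python; `Raw` and `render` are hypothetical names, with `Raw` playing roughly the role that `Html.Raw` plays in Razor.

```python
import html

class Raw(str):
    """Marker type: the developer explicitly vouches that this is safe HTML."""

def render(template: str, **values) -> str:
    """Fill a template, HTML-encoding every value that isn't wrapped in Raw."""
    safe = {
        name: value if isinstance(value, Raw) else html.escape(str(value))
        for name, value in values.items()
    }
    return template.format(**safe)

# Default path: encoded without the developer doing anything.
encoded = render("<p>{name}</p>", name="<script>alert(1)</script>")

# Opt-out path: has to be spelled out, so it stands out in code review.
unencoded = render("<p>{body}</p>", body=Raw("<b>ok</b>"))

print(encoded)
print(unencoded)
```

The point of the design is that forgetting something gives you the safe result, and the unsafe result needs a deliberate, searchable call.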

Furthermore, you could consider disabling request validation only selectively. Do you really need to turn it off for every request? The vast majority of requests, presumably, would not need to preserve (or allow) HTML.


Using a regex to strip HTML is fairly easy to defeat and very difficult to get correct. If you want to clean HTML input, it's better to use an actual parser to enforce strict XML compliance.
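A rough sketch of both halves of that claim. The regex is a hypothetical naive pattern, and the parser-based version is a simple tag allowlist built on Python's standard `html.parser`. That is one possible reading of "use an actual parser", not strict XML validation, and the allowlist contents are an assumption for the example. For real use you'd want a maintained sanitizer library rather than this.

```python
import html
import re
from html.parser import HTMLParser

# Hypothetical naive pattern: only matches when a closing '>' is present.
NAIVE_TAG_RE = re.compile(r"<[^>]*>")

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # assumed allowlist

class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags (without attributes); escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes are always dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data))

def sanitize(value: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(value)
    parser.close()
    return "".join(parser.out)

# An unterminated tag has no '>', so the regex leaves it untouched.
payload = "<img src=x onerror=alert(1)//"
regex_result = NAIVE_TAG_RE.sub("", payload)

parser_result = sanitize(payload)
mixed = sanitize("<b>hi</b><script>alert(1)</script>")
```

The regex passes the unterminated `<img` through unchanged, and a browser will happily close that tag with the next `>` it finds in the page. The parser-based version never emits a tag it didn't build itself, so the worst case is harmless escaped text.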

What I would do in this situation is store two fields in the database: clean and raw for the data. When the user wants to edit their content, you send them the raw data. When they submit changes, you sanitize it and store it in the clean field. Developers then only ever use the clean field when outputting the content to the page. I would even go so far as to name the raw field dangerousRawContent so it's obvious that care must be taken when referencing that field.

The added benefit of this technique is that you can re-sanitize the raw data with improved parsers at a later date without ever losing the originally intended content.
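The two-field idea could look roughly like this. It's only a sketch: the `Post` type, the function names, and both sanitizers are placeholders invented for the example (the "sanitizers" here are trivial stand-ins so the snippet stays self-contained).

```python
import html
from dataclasses import dataclass
from typing import Callable

Sanitizer = Callable[[str], str]

@dataclass
class Post:
    dangerous_raw_content: str  # exactly what the user typed; only for the edit form
    clean_content: str          # sanitized copy; the only field views should output

def escape_everything(value: str) -> str:
    """Placeholder sanitizer: treats all input as plain text."""
    return html.escape(value)

def save(raw: str, sanitizer: Sanitizer = escape_everything) -> Post:
    """Store the raw input alongside its sanitized form."""
    return Post(dangerous_raw_content=raw, clean_content=sanitizer(raw))

def resanitize(post: Post, better_sanitizer: Sanitizer) -> Post:
    """Rebuild the clean field from the untouched raw field."""
    raw = post.dangerous_raw_content
    return Post(dangerous_raw_content=raw, clean_content=better_sanitizer(raw))

post = save("<b>Hello</b> & welcome")

# Later: swap in a (hypothetical) improved sanitizer that allows <b> tags.
def allow_bold(value: str) -> str:
    escaped = html.escape(value)
    return escaped.replace("&lt;b&gt;", "<b>").replace("&lt;/b&gt;", "</b>")

upgraded = resanitize(post, allow_bold)
```

Because the raw field is never modified, the upgrade loses nothing, and the scary field name makes accidental use in a view easy to spot.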
