PHP Markdown XSS Sanitizer
I'm looking for a simple PHP library that helps filter XSS vulnerabilities out of PHP Markdown output. For example, PHP Markdown will happily parse things such as:
[XSS Vulnerability](javascript:alert('xss'))
I've been doing some reading around and the best I've found on the subject here was this question.
Although HTML Purifier looks like the best (nearly the only) solution, I was wondering if there was anything out there more general. HTML Purifier seems to be a bit robust, especially for my needs, as well as a pain to configure, though it looks like it would work very well once configured.
Is there anything else out there that is a little less robust and configurable but still does a solid job? Or should I just dig in and start configuring HTML Purifier for my needs?
EDIT FOR CLARITY: I'm not looking to cut corners or anything of the like. HTML Purifier offers a lot of fine-grained control, and for a simple small project that much control simply isn't needed, though using nothing isn't an option either. This is where I was coming from when asking for something simpler or less robust.
Also, a final note: I'm NOT looking for suggestions to use htmlspecialchars(), strip_tags() or anything of the like. I already disallow embedded HTML in PHP Markdown Extra by sanitizing it in a similar fashion. I'm looking for ways to prevent XSS vulnerabilities in PHP Markdown OUTPUT.
Thanks.
I've never heard of any tool other than HTML Purifier for doing that, and HTML Purifier does indeed have a good reputation.
Maybe it's "a bit robust" and "a pain to configure", yes; but it's also probably the most used and most tested solution available in PHP, and those are important criteria when you have to choose such an important component.
Even if it means investing half a day to configure it properly, if I were in your situation, I would probably choose HTML Purifier.
There is no such thing as too robust. “Sanitising” HTML is hard. Any corners you cut to process it more simply are likely to result in exploits sneaking through. Even complicated old HTMLPurifier, with its best-of-breed reputation, has had multiple ways of sneaking dangerous markup through in the past!
However, if your text-markup solution is capable of outputting dangerous HTML then it is deficient and should be replaced IMO. If PHP Markdown allows javascript: URLs through then that's a pretty lamentable, basic flaw and I don't think I'd trust it to get anything else right.
I had a suggestion, and I asked on SO to find out if it would work but unfortunately, it was closed and marked as a duplicate to this question.
My suggestion is to modify Markdown's code so that link targets and image sources are only accepted when they start with http://, https:// or ftp://, which covers all the commonly needed protocols. If a link doesn't start with one of these, it should be left unchanged in the output, i.e. not converted into a link at all.
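To make the idea concrete, here is a rough sketch of the check, not a patch for PHP Markdown itself. The function names is_allowed_url() and render_link() are made up for illustration; in practice the check would have to be wired into wherever the parser builds its anchor and image tags. Note that a strict prefix allowlist like this also rejects relative links and mailto: addresses, which may or may not be what you want.

```php
<?php
// Hypothetical helper (not part of PHP Markdown): returns true only when
// the URL starts with one of the allowed scheme prefixes.
function is_allowed_url(string $url): bool
{
    $allowed = ['http://', 'https://', 'ftp://'];
    $url = trim($url);
    foreach ($allowed as $prefix) {
        // Case-insensitive comparison of the leading characters only.
        if (strncasecmp($url, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: emit an anchor for allowed URLs; otherwise fall back
// to the original Markdown source, escaped so it is shown as plain text.
// Assumes $text is plain text (no nested inline markup).
function render_link(string $text, string $url): string
{
    if (!is_allowed_url($url)) {
        return htmlspecialchars("[$text]($url)", ENT_QUOTES, 'UTF-8');
    }
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</a>';
}
```

Because this is an allowlist rather than a blocklist, obfuscated variants of javascript: (mixed case, leading whitespace, embedded control characters) are rejected along with everything else that isn't explicitly permitted.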
HTMLPurifier is a fine answer and perhaps the most robust solution.
It is also possible to use Markdown in a relatively safe way, but you have to use it in the right way. For details on how to use Markdown securely, look here. The short version is: it is important to use the latest version, to set safe_mode, and to set enable_attributes=False.