Strip JavaScript on(whatever) events from code using PHP
I want to strip out all JavaScript from a small snippet (4-6 lines) of HTML. I've read on here before that it's best not to use regex on HTML, so if anybody knows a better way, please advise.
So, for example, I have the following code:
<a href="go/to/my/link" onclick="fetchMeSomeData(this)">My Link</a>
<p onfocus="doSomethingAmazing();"></p>
Now, in PHP, I want to replace each on(whatever event it is) attribute with just an empty space.
Use the HTML Purifier library to strip things like JavaScript and plugins from the code. It's much better than a blacklist-based regex approach because it uses a full HTML parser and a whitelist to clean the HTML.
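If adding a library is not an option, the "use a real parser" idea can be roughly sketched with PHP's bundled DOM extension. Note that this is not HTML Purifier and not a whitelist: it only drops attributes whose name starts with "on", so script tags, javascript: URLs and similar vectors are left untouched. The function name stripEventAttributes is hypothetical, and the sketch assumes the snippet is UTF-8.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): removes every on* attribute
// from an HTML fragment using the bundled DOM extension.
function stripEventAttributes(string $html): string
{
    // Collect parser warnings about fragments instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a full document; the meta tag hints UTF-8 to libxml.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html
        . '</body></html>'
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Take a snapshot of all attribute nodes, then remove the on* ones.
    $xpath = new DOMXPath($doc);
    foreach (iterator_to_array($xpath->query('//@*')) as $attr) {
        if (stripos($attr->nodeName, 'on') === 0) {
            $attr->ownerElement->removeAttributeNode($attr);
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripEventAttributes('<a href="go/to/my/link" onclick="fetchMeSomeData(this)">My Link</a>');
```

Because the parser rebuilds the markup, the output may differ cosmetically from the input (for example in quoting or entity encoding), which is usually acceptable for short snippets.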
I built such a regexp some time ago; it looks a bit scary though :). Here is the pure regexp; you might additionally need to escape (mask) special characters to match your language's requirements.
(\son[a-z]+\s*=\s*"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*"(?=[^<]*?>))|(\son[a-z]+\s*=\s*'[^'\\\r\n]*(?:\\.[^'\\\r\n]*)*'(?=[^<]*?>))
Here is the escaped (masked) version, following Java string-literal rules, which you should be able to use as a string.
(\\son[a-z]+\\s*=\\s*\"[^\"\\\\\\r\\n]*(?:\\\\.[^\"\\\\\\r\\n]*)*\"(?=[^<]*?>))|(\\son[a-z]+\\s*=\\s*'[^'\\\\\\r\\n]*(?:\\\\.[^'\\\\\\r\\n]*)*'(?=[^<]*?>))
It looks only inside tags and takes escaped (masked) quotes inside event handlers into account. I'm sure it is not 100% bulletproof though.
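Since the question is about PHP, here is a minimal sketch of how the pure pattern might be used with preg_replace. A nowdoc is used so that none of the backslashes need extra escaping. The ~ delimiters, the i flag (to also catch onClick or ONCLICK), the constant name and the function name are all assumptions added for this example, not part of the original regexp.

```php
<?php
// The pure pattern from above, wrapped in ~ delimiters with an added i flag.
// A nowdoc keeps every backslash exactly as written.
const ON_EVENT_PATTERN = <<<'REGEX'
~(\son[a-z]+\s*=\s*"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*"(?=[^<]*?>))|(\son[a-z]+\s*=\s*'[^'\\\r\n]*(?:\\.[^'\\\r\n]*)*'(?=[^<]*?>))~i
REGEX;

// Hypothetical helper: removes quoted on* attributes that appear inside tags.
function stripEventAttributesRegex(string $html): string
{
    $result = preg_replace(ON_EVENT_PATTERN, '', $html);
    if ($result === null) {
        // preg_replace returns null on failure; do not hand back unfiltered input.
        throw new RuntimeException('Regex failed: ' . preg_last_error());
    }
    return $result;
}

echo stripEventAttributesRegex('<a href="go/to/my/link" onclick="fetchMeSomeData(this)">My Link</a>');
// prints: <a href="go/to/my/link">My Link</a>
```

The pattern only matches quoted attribute values, so an unquoted handler such as onclick=doIt() would pass through unchanged.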