PHP BBcode using preg_replace() - Prevent users from entering onClick, onKeyPress
I have a question that is probably simple (though not for me). First, please take a look at this:
$msg=preg_replace("/\[b(.*?)\](.*?)\[\/b\]/i", "<b $1>$2</b>", $msg);
With that regex, anything in $msg that matches is replaced with a new form. It is easier to show with an example:
It will turn
[b]TEXT[/b]
into
<b>TEXT</b>
Or it will turn
[b style="color: red;" title="HELLO"]TEXT[/b]
into
<b style="color: red;" title="HELLO">TEXT</b>
Here is where the problem comes from. What happens if it turns
[b onclick="SOME TROJAN SCRIPT"]TEXT[/b]
into
<b onclick="SOME TROJAN SCRIPT">TEXT</b>
What I want is this: instead of copying every attribute that follows [b attribute1 attribute2 ... attributeN], the function should keep those attributes ONLY IF THEY DO NOT START WITH on (like onClick, onMouseOver...).
I would appreciate any suggestions ^^! Thank you in advance...
PECL offers a BBCode package. PEAR also has an equivalent package, if you can't install PECL packages. Either will make working with BBCode much easier for you... once you work it out.
Regex is rarely ever the right tool to stop HTML/JavaScript related security issues.
Use an HTML parser.
This will be far easier to whitelist than blacklist, particularly because of the myriad ways malicious users can obfuscate the JavaScript. I would make a list of acceptable entries and work from there instead. Yes, I realize that they could technically have any CSS entry there, but (1) you're the one who wants to allow users to create their own HTML, practically inviting all sorts of XSS headaches, and (2) this is only a <b> tag, so you should be OK with a small subset of allowable CSS properties.
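As a rough illustration of the whitelist idea, here is a minimal PHP sketch. The function names (bbcode_bold, filter_attrs, filter_style) and the two whitelists are made up for this example, and it assumes attributes are always written as name="value" with double quotes. Treat it as a starting point under those assumptions, not a vetted sanitizer.

```php
<?php
// Hypothetical whitelists: adjust to what you actually want to allow.
const ALLOWED_ATTRS = ['title', 'style'];
const ALLOWED_CSS   = ['color', 'background-color'];

// Keep only whitelisted CSS declarations with plain values such as "red" or "#ff0000".
function filter_style(string $style): string
{
    $kept = [];
    foreach (explode(';', $style) as $decl) {
        $parts = explode(':', $decl, 2);
        if (count($parts) !== 2) {
            continue; // empty or malformed declaration
        }
        $prop  = strtolower(trim($parts[0]));
        $value = trim($parts[1]);
        if (in_array($prop, ALLOWED_CSS, true) && preg_match('/^#?[a-z0-9]+$/i', $value)) {
            $kept[] = "$prop: $value";
        }
    }
    return implode('; ', $kept);
}

// Rebuild the attribute string from whitelisted name="value" pairs only.
function filter_attrs(string $raw): string
{
    $out = '';
    preg_match_all('/([a-z-]+)\s*=\s*"([^"]*)"/i', $raw, $pairs, PREG_SET_ORDER);
    foreach ($pairs as $p) {
        $name  = strtolower($p[1]);
        $value = $p[2];
        if (!in_array($name, ALLOWED_ATTRS, true)) {
            continue; // onclick, onmouseover and anything unknown are dropped
        }
        if ($name === 'style') {
            $value = filter_style($value);
            if ($value === '') {
                continue;
            }
        }
        $out .= " $name=\"$value\"";
    }
    return $out;
}

function bbcode_bold(string $msg): string
{
    // Escape <, > and & first so raw HTML in the message is inert; quotes are
    // left alone so that attribute values can still be parsed.
    $msg = htmlspecialchars($msg, ENT_NOQUOTES, 'UTF-8');
    return preg_replace_callback(
        '/\[b((?:\s[^\]]*)?)\](.*?)\[\/b\]/is',
        function (array $m): string {
            return '<b' . filter_attrs($m[1]) . '>' . $m[2] . '</b>';
        },
        $msg
    );
}

echo bbcode_bold('[b style="color: red;" title="HELLO" onclick="alert(1)"]TEXT[/b]'), "\n";
```

Because the tag is rebuilt from the whitelist rather than copied from the input, an attribute never has to be recognised as dangerous in order to be dropped; it only has to be missing from the list.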
You're playing with fire, but this should cure your immediate problem:
s/\[b(\s*|\s+(?:(?!(?<=\s)on..*?\s*=\s*['"]).)*?)\](.*?)\[\/b\]/<b$1>$2<\/b>/xi
or rx = /\[b(\s*|\s+(?:(?!(?<=\s)on..*?\s*=\s*['"]).)*?)\](.*?)\[\/b\]/
and replacement = <b$1>$2<\/b>
and some other subtle fixes.
EDIT: A test case that includes the sample [b onclick="alert('HELLO');"]HELLO[/b]
use strict;
use warnings;
my @samps = (
'[b]TEXT[/b]',
'[b on="]TEXT[/b]',
'[b styleon="color: red;" title="HELLO"]TE
XT[/b]',
'[b onclick="SOME TROJAN SCRIPT"]TEXT[/b]',
'[b onclick="alert(\'HELLO\');"]HELLO[/b]',
);
for (@samps) {
    print "Testing $_\n";
    if ( s/\[b(\s*|\s+(?:(?!(?<=\s)on..*?\s*=\s*['"]).)*?)\](.*?)\[\/b\]/<b$1>$2<\/b>/si ) {
        print " .. passed $_\n";
    }
    else {
        print " .. failed\n";
    }
}
Output
Testing [b]TEXT[/b]
.. passed <b>TEXT</b>
Testing [b on="]TEXT[/b]
.. passed <b on=">TEXT</b>
Testing [b styleon="color: red;" title="HELLO"]TE
XT[/b]
.. passed <b styleon="color: red;" title="HELLO">TE
XT</b>
Testing [b onclick="SOME TROJAN SCRIPT"]TEXT[/b]
.. failed
Testing [b onclick="alert('HELLO');"]HELLO[/b]
.. failed
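Since the question is about PHP, here is a sketch of how the same pattern might be used with preg_replace(). The variable names are only for illustration, and the /si flags are taken from the Perl test script above.

```php
<?php
// Port of the Perl substitution above. The single quote inside the
// character class is escaped because the pattern is a single-quoted string.
$pattern     = '/\[b(\s*|\s+(?:(?!(?<=\s)on..*?\s*=\s*[\'"]).)*?)\](.*?)\[\/b\]/si';
$replacement = '<b$1>$2</b>';

$samples = [
    '[b]TEXT[/b]',
    '[b style="color: red;" title="HELLO"]TEXT[/b]',
    '[b onclick="SOME TROJAN SCRIPT"]TEXT[/b]',
];

foreach ($samples as $msg) {
    // Tags that contain an on...= attribute do not match and are left as-is.
    echo preg_replace($pattern, $replacement, $msg), "\n";
}
```

Note that the pattern only rejects an on... attribute that is preceded by whitespace. An input such as [b style="x"onclick="y"]TEXT[/b] still matches and comes out as <b style="x"onclick="y">TEXT</b>, which browsers generally still parse as an onclick attribute, so this remains a blacklist with gaps.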