How to stop BB Code manipulation (part two)?

I recently discovered an issue where people using BB Code to enter links are able to manipulate them.

They are meant to enter something like:

[LINK=http://www.domain.com]example text[/LINK]

However they can enter something like this to make the link color red:

[LINK=http://www.domain.com 'span style="color:red;"']example text[/LINK]

This is the code which converts it:

$text = preg_replace("/\[LINK\=(.*?)\](.*?)\[\/LINK\]/is", "<a href='$1' target='_blank'>$2</a>", $text);
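
To make the problem concrete, here is a minimal sketch that runs the replacement above on the manipulated input. The variable names $input and $html are only for illustration. Because the captured text is pasted straight into the single-quoted href, the first quote in the input closes the attribute and the rest is parsed as extra attributes on the anchor.

```php
<?php
// The manipulated input from the question.
$input = '[LINK=http://www.domain.com \'span style="color:red;"\']example text[/LINK]';

// The same replacement as in the question, unchanged.
$html = preg_replace(
    "/\[LINK\=(.*?)\](.*?)\[\/LINK\]/is",
    "<a href='$1' target='_blank'>$2</a>",
    $input
);

// The href value ends at the first injected quote, so style="color:red;"
// ends up as an attribute of the <a> element itself.
echo $html, PHP_EOL;
```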

Someone else was kind enough to provide a solution to a very similar problem, but they want me to start a new question for this. Their solution just needs adapting. I have tried myself, but I really can't get it to work. (See the earlier question: "How to stop BB Code manipulation?")


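// Single-quoted pattern: the original "\\\]" required a literal backslash before "]".
$text = preg_replace_callback('/\[LINK=(.*?)\](.*?)\[\/LINK\]/is',
    function (array $matches) {
        // Only emit a link when the captured target is a well-formed URL.
        if (filter_var($matches[1], FILTER_VALIDATE_URL)) {
            // Escape both captures so quotes cannot break out of the attribute.
            return '<a href="' .
                htmlspecialchars($matches[1], ENT_QUOTES) .
                '" target="_blank">' .
                htmlspecialchars($matches[2]) . '</a>';
        }
        return 'INVALID MARKUP';
    }, $text);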

Use a callback to validate the URL and don't forget htmlspecialchars.


I think the simplest and best solution might be to run the URL through htmlspecialchars() so that quotes and angle brackets are escaped before the value reaches the page. An escaped value cannot break out of the href attribute. Pass ENT_QUOTES: the code in the question wraps the href in single quotes, and before PHP 8.1 htmlspecialchars() does not escape single quotes by default.
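
As a rough sketch of that idea, the snippet below escapes the manipulated URL from the question before it is placed in the attribute. The variable names are only for illustration, and ENT_QUOTES is passed explicitly so that single quotes are escaped as well as double quotes.

```php
<?php
// The manipulated URL part from the question's example.
$url = 'http://www.domain.com \'span style="color:red;"\'';

// ENT_QUOTES escapes both ' and ", so neither can close the attribute.
$escaped = htmlspecialchars($url, ENT_QUOTES);

echo "<a href='" . $escaped . "' target='_blank'>example text</a>", PHP_EOL;
```

Escaping alone keeps the value inside the href, but it does not check that the value is a sensible URL, so it may be worth combining with validation.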


Instead of using a regex replace, use a regex match to extract the information that you want, in this case the link and link text.

Then write that information out in the right format. That should eliminate the opportunity to get weird data into the output.

You can even double-check the variables before using them to make sure they don't contain any HTML.
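
The three steps above (match, check, write out) could look roughly like the sketch below. The function name render_link is hypothetical, and the rules it enforces (only http and https URLs, no HTML in the link text) are assumptions rather than requirements stated in the question.

```php
<?php
// Hypothetical helper: renders one [LINK=...]...[/LINK] tag,
// or returns null when the tag does not pass the checks.
function render_link(string $bbcode): ?string
{
    // Step 1: extract the URL and the link text instead of substituting blindly.
    if (!preg_match('/^\[LINK=(.*?)\](.*?)\[\/LINK\]$/is', $bbcode, $m)) {
        return null;
    }
    [, $url, $label] = $m;

    // Step 2: double-check the extracted values.
    // Assumption: only well-formed http/https URLs are acceptable.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!filter_var($url, FILTER_VALIDATE_URL)
        || !in_array($scheme, ['http', 'https'], true)) {
        return null;
    }
    // Assumption: the link text must not contain any HTML tags.
    if ($label !== strip_tags($label)) {
        return null;
    }

    // Step 3: write the link out in a fixed format, escaping both values.
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '" target="_blank">'
        . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo render_link('[LINK=http://www.domain.com]example text[/LINK]'), PHP_EOL;
```

Returning null leaves it to the caller to decide what to show for a rejected tag, for example the original text escaped with htmlspecialchars().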
