Regex for bbcode seems to fail on long sentences

I need some help with my BBCode replacing. Right now I'm doing the following to find and replace BBCode:

    $bbMatch[0] =   '/(\[b\])(.*)(\[\/b\])/';
    $bbReplace[0] = '<strong>${2}</strong>';

    $bbMatch[1] =   '/(\[url\])(.*)(\[\/url\])/';
    $bbReplace[1] = '[url=${2}]${2}[/url]';

    $bbMatch[2] =   '/(\[url=)(.+)(\])(.+)(\[\/url\])/';
    $bbReplace[2] = '<a href="${2}" target="_blank">${4}</a>';

    $bbMatch[3] =   '/(\[s\])(.*)(\[\/s\])/';
    $bbReplace[3] = '<span style="text-decoration: line-through;">${2}</span>';

    $bbMatch[4] =   '/(\[u\])(.*)(\[\/u\])/';
    $bbReplace[4] = '<span style="text-decoration: underline;">${2}</span>';

    $bbMatch[5] =   '/(\[i\])(.*)(\[\/i\])/';
    $bbReplace[5] = '<em>${2}</em>';

    // Escape HTML special characters
    $text = htmlspecialchars($text);

    // Apply the BBCode replacements
    $text = preg_replace($bbMatch, $bbReplace, $text);

The problem is that when a long sentence is run through this, it fails to find the right end tag. For example, it would show this:

Some text in italics[/i] with some words here [i]also text in italics

As you can see, it shows the end tag of the first one and the begin tag of the second. How would I fix this?


Your problem is that regex quantifiers are greedy by default, so (.*) grabs everything between the first [i] and the last [/i]. You told it to match any characters between those two tags, and because it tries to match as many as it can, it will gladly swallow an inner [/i] and [i] as long as a closing [/i] still follows later in the text. You just need to add a ? after the * to make it non-greedy, for example:

    $bbMatch[5] =   '/(\[i\])(.*?)(\[\/i\])/';
    $bbReplace[5] = '<em>${2}</em>';
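
To see the difference side by side, here is a minimal sketch that runs both versions of the italics pattern over a sentence modeled on the one in the question. The sample string and the variable names $greedy and $lazy are made up for illustration, and the capture groups around the tags are dropped since only the inner text is reused.

```php
<?php
// Sample input modeled on the question: two separate [i]...[/i] pairs.
$text = '[i]Some text in italics[/i] with some words here [i]also text in italics[/i]';

// Greedy: (.*) runs on to the LAST [/i], so the inner [/i] and [i] are swallowed.
$greedy = preg_replace('/\[i\](.*)\[\/i\]/', '<em>$1</em>', $text);

// Lazy: (.*?) stops at the FIRST [/i] that lets the pattern match.
$lazy = preg_replace('/\[i\](.*?)\[\/i\]/', '<em>$1</em>', $text);

echo $greedy, "\n";
// <em>Some text in italics[/i] with some words here [i]also text in italics</em>
echo $lazy, "\n";
// <em>Some text in italics</em> with some words here <em>also text in italics</em>
```

The greedy result is exactly the symptom described in the question: one long italic run with a stray [/i] and [i] in the middle.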

You'll want to change all your regexes like that btw, not just your italics.
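
As a rough sketch of what that could look like for the whole set, the version below makes every quantifier lazy and keeps the same order of operations as the question (escape first, then replace). The function name bbcodeToHtml is hypothetical, and the unused capture groups around the tags are removed, so the replacements use $1 and $2 instead of ${2} and ${4}.

```php
<?php
// Hypothetical helper; the patterns mirror the question's, with lazy quantifiers.
function bbcodeToHtml(string $text): string
{
    $bbMatch = [
        '/\[b\](.*?)\[\/b\]/',
        '/\[url\](.*?)\[\/url\]/',          // bare [url] is rewritten to [url=...] first
        '/\[url=(.+?)\](.+?)\[\/url\]/',    // then every [url=...] becomes a link
        '/\[s\](.*?)\[\/s\]/',
        '/\[u\](.*?)\[\/u\]/',
        '/\[i\](.*?)\[\/i\]/',
    ];
    $bbReplace = [
        '<strong>$1</strong>',
        '[url=$1]$1[/url]',
        '<a href="$1" target="_blank">$2</a>',
        '<span style="text-decoration: line-through;">$1</span>',
        '<span style="text-decoration: underline;">$1</span>',
        '<em>$1</em>',
    ];

    // Escape HTML before inserting our own tags, as in the question.
    $text = htmlspecialchars($text);

    // Patterns are applied in array order, each on the result of the previous one.
    // preg_replace returns null on a regex error; fall back to the escaped text.
    return preg_replace($bbMatch, $bbReplace, $text) ?? $text;
}

echo bbcodeToHtml('[b]bold[/b] and [i]one[/i] then [i]two[/i] [url]http://example.com[/url]'), "\n";
// <strong>bold</strong> and <em>one</em> then <em>two</em> <a href="http://example.com" target="_blank">http://example.com</a>
```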

Here is an example of greedy vs. non-greedy regex: http://www.exampledepot.com/egs/java.util.regex/Greedy.html
