Regex for parsing italic text?

Suppose I have the following text:

__This_is__ a __test__

I am using two underscores to denote italics, so I expect This_is and test to be italicized. Any text between two consecutive double underscores should be italicized, including any single underscores it contains. So far I have:

__([^_]+)__

What is the equivalent of "not two consecutive underscores" in group 1? Thanks.


An option would be to match two underscores:

__

Then use a negative lookahead to check that there are not two underscores directly ahead of the current position:

__(?!__)

If the lookahead succeeds (no double underscore ahead), match any single character:

__(?!__).

and repeat that group (the lookahead plus one character) one or more times:

__((?!__).)+

and finally match another two underscores:

__((?!__).)+__

which is the final solution.

A little demo:

<?php
$text = '__This_is__ a __test__';
preg_match_all('/__(?:(?!__).)+__/', $text, $matches);
print_r($matches);
?>

produces:

Array
(
    [0] => Array
        (
            [0] => __This_is__
            [1] => __test__
        )

)

as can be seen on Ideone.

EDIT

Note that I used a non-capturing group in my demo; otherwise the output would have looked like this:

Array
(
    [0] => Array
        (
            [0] => __This_is__
            [1] => __test__
        )

    [1] => Array
        (
            [0] => s
            [1] => t
        )

)

That is, group 1 would hold only the last character matched by ((?!__).), because a repeated capturing group keeps just its final iteration.

For more about groups, see: http://www.regular-expressions.info/brackets.html
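
If the goal is to get the text between the delimiters into group 1, as the question asks, one possible variation is to wrap the whole repeated group in a single capturing group. The sketch below is only an illustration: the $pattern variable and the <i> replacement are assumptions and do not come from the original demo.

```php
<?php
$text = '__This_is__ a __test__';

// The inner (?:...) stays non-capturing; the outer (...) captures
// everything the repetition consumed, i.e. the text between the delimiters.
$pattern = '/__((?:(?!__).)+)__/';

preg_match_all($pattern, $text, $matches);
// $matches[1] holds the inner text of every match: "This_is" and "test".
print_r($matches[1]);

// Hypothetical use: wrap each captured text in an HTML <i> element.
$html = preg_replace($pattern, '<i>$1</i>', $text);
echo $html, "\n"; // <i>This_is</i> a <i>test</i>
```

Because the capturing group sits outside the quantifier, it records the complete run of characters rather than only the last one.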


<?php
$text = '__This_is__ a __test__';
// \w also matches "_", so single underscores inside the text are allowed.
preg_match_all('/(__(\w+)__)/', $text, $matches);
print_r($matches);

http://ideone.com/uHJCC
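
The \w-based pattern and the lookahead-based pattern agree on the sample text, but they can differ on other input. The following is a minimal comparison, assuming the italic text may contain spaces (the question does not say either way). The sample string and variable names are made up for illustration.

```php
<?php
// Hypothetical input in which one italic span contains a space.
$text = '__two words__ and __one__';

// \w matches letters, digits and "_", but not spaces.
preg_match_all('/__(\w+)__/', $text, $wordOnly);

// The lookahead version accepts any character except a newline,
// as long as a double underscore does not start at that position.
preg_match_all('/__((?:(?!__).)+)__/', $text, $lookahead);

print_r($wordOnly[1]);   // only "one"
print_r($lookahead[1]);  // "two words" and "one"
```

Which behaviour is preferable depends on whether spaces and punctuation should be allowed inside italic text.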
