Regex for parsing italic text?
Suppose I have the following text:
__This_is__ a __test__
I'm using two underscores to denote italics, so I expect This_is
and test
to be italicized. The rule is that any text between two consecutive double underscores should be italicized, including any number of single underscores that may appear inside it. I've got:
__([^_]+)__
What is the equivalent of "not two consecutive underscores" in group 1? Thanks.
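For illustration, here is a minimal sketch of what that attempt matches on the sample text. The variable names are only placeholders.

```php
<?php
$text = '__This_is__ a __test__';

// [^_]+ cannot cross the single underscore inside "This_is",
// so the pattern fails at the first delimiter and starts at the second one.
preg_match_all('/__([^_]+)__/', $text, $attempt);
print_r($attempt);
```

The only match is "__ a __", with " a " in group 1, so neither This_is nor test is picked up.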
An option would be to match two underscores:
__
Then add a negative lookahead to assert that there are not two underscores directly ahead of the current position:
__(?!__)
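If the lookahead succeeds (that is, the next two characters are not both underscores), match any single character (by default, . does not match a newline):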
__(?!__).
and repeat that lookahead-plus-character pair one or more times:
__((?!__).)+
and finally match another two underscores:
__((?!__).)+__
which is the final solution.
A little demo:
<?php
$text = '__This_is__ a __test__';
preg_match_all('/__(?:(?!__).)+__/', $text, $matches);
print_r($matches);
?>
produces:
Array
(
[0] => Array
(
[0] => __This_is__
[1] => __test__
)
)
as can be seen on Ideone.
EDIT
Note that I used a non-capturing group in my demo; with the capturing group, the output would have looked like this:
Array
(
[0] => Array
(
[0] => __This_is__
[1] => __test__
)
[1] => Array
(
[0] => s
[1] => t
)
)
i.e. only the last character matched by ((?!__).)
would have been captured in group 1, because a repeated capturing group keeps just its final iteration.
For more about groups, see: http://www.regular-expressions.info/brackets.html
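To get the whole inner text into group 1, as the question asks, one possible sketch is to wrap the entire repetition in a capturing group while keeping the inner group non-capturing. The `<i>` replacement step and the `$html` variable are assumptions added for illustration, not part of the original demo.

```php
<?php
$text = '__This_is__ a __test__';

// Group 1 now wraps the whole repetition, so it holds the full inner text.
$pattern = '/__((?:(?!__).)+)__/';

preg_match_all($pattern, $text, $matches);
print_r($matches[1]); // the inner texts: This_is and test

// Hypothetical rendering step: wrap each inner text in <i> tags.
$html = preg_replace($pattern, '<i>$1</i>', $text);
echo $html, "\n";
```

With the sample text, this should print `<i>This_is</i> a <i>test</i>` on the last line.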
<?php
$text = '__This_is__ a __test__';
// \w matches letters, digits and underscores; group 2 holds the text between the delimiters
preg_match_all('/(__([\w]+)__)/', $text, $matches);
print_r($matches);
http://ideone.com/uHJCC
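As a rough check of how this \w-based pattern behaves when the delimited text contains a space, the sketch below runs it on a made-up sample. The sample string and the variable names are assumptions for illustration only.

```php
<?php
// Hypothetical sample: the second delimited span contains a space.
$sample = '__This_is__ and __two words__';

preg_match_all('/(__([\w]+)__)/', $sample, $found);
print_r($found[2]);
```

Only This_is is found here, because \w does not match the space in "two words". The lookahead-based pattern from the other answer does not have that restriction.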