Regex for spliting on all unescaped semi-colons
I'm using php's preg_split to split up a string based on semi-colons, but I need it to only split on non-escaped semi-colons.
<?
$str = "abc;def\\;abc;def";
$arr = preg_split("/;/", $str);
print_r($arr);
?>
Produces:
Array
(
[0] =>开发者_Python百科; abc
[1] => def\
[2] => abc
[3] => def
)
When I want it to produce:
Array
(
[0] => abc
[1] => def\;abc
[2] => def
)
I've tried "/(^\\)?;/"
or "/[^\\]?;/"
but they both produce errors. Any ideas?
This works.
<?
$str = "abc;def\;abc;def";
$arr = preg_split('/(?<!\\\);/', $str);
print_r($arr);
?>
It outputs:
Array
(
[0] => abc
[1] => def\;abc
[2] => def
)
You need to make use of a negative lookbehind (read about lookarounds). Think of "match all ';' unless preceed by a '\'".
I am not really proficient with PHP regexes, but try this one:
/(?<!\\);/
Since Bart asks: Of course you can also use regex to split on unescaped ; and take escaped escape characters into account. It just gets a bit messy:
<?
$str = "abc;def\;abc\\\\;def";
preg_match_all('/((?:[^\\\\;]|\\\.)*)(?:;|$)/', $str, $arr);
print_r($arr);
?>
Array
(
[0] => Array
(
[0] => abc;
[1] => def\;abc\\;
[2] => def
)
[1] => Array
(
[0] => abc
[1] => def\;abc\\
[2] => def
)
)
What this does is to take a regular expression for “(any character except \ and ;) or (\ followed by any character)” and allow any number of those, followed by a ; or the end of the string.
I'm not sure how php handles $ and end-of-line characters within a string, you may need to set some regex options to get exactly what you want for those.
精彩评论