Little help with regex
How can I match these:
(1, 'asd', 'asd2')
but not match this:
(1, '(data)', 0)
I want to match the outer ( and ), but not a ( or ) that appears inside them.
Actually these are queries and I want to split them via preg_split.
/[\(*\)]+/
splits them, but it also splits on the ( and ) inside the values. How can I fix this?
Example:
The data is:
(1, 'user1', 1, 0, 0, 0)(2, 'user(2)', 1, 0, 0, 1)
I want to split them as:
Array(
0 => (1, 'user1', 1, 0, 0, 0)
1 => (2, 'user(2)', 1, 0, 0, 1)
);
Instead, it is split as:
Array(
0 => (1, 'user1', 1, 0, 0, 0)
1 => (2, 'user
2 => 2
3 => ', 1, 0, 0, 1)
);
A regex for this would be a little nasty. Instead, you can iterate over the entire string and decide where to split:
- If it's a ), split there. (I'm assuming the brackets are balanced in the string and can't be nested.)
- If it's a ', ignore any ) until a closing '. (If it can be escaped, you can look at the previous characters for an odd number of \.)
I think this is a more straightforward solution than a regex.
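The scan described above could be sketched roughly like this. The function name split_tuples is made up for illustration, and the escape handling is an assumption: instead of counting backslashes behind a quote, it skips the character that follows a backslash inside a quoted value, which has the same effect.

```php
<?php
// Hypothetical helper (not from the original answer): splits a string of
// back-to-back tuples such as "(1, 'a')(2, 'b(c)')" into one string per tuple.
function split_tuples(string $source): array
{
    $tuples = [];
    $current = '';
    $inQuote = false;
    $length = strlen($source);

    for ($i = 0; $i < $length; $i++) {
        $char = $source[$i];
        $current .= $char;

        if ($inQuote && $char === '\\' && $i + 1 < $length) {
            // Assumption: a backslash escapes the next character inside quotes,
            // so copy that character without inspecting it.
            $current .= $source[++$i];
        } elseif ($char === "'") {
            // Entering or leaving a quoted value.
            $inQuote = !$inQuote;
        } elseif ($char === ')' && !$inQuote) {
            // A closing parenthesis outside quotes ends the current tuple.
            $tuples[] = trim($current);
            $current = '';
        }
    }

    return $tuples;
}

print_r(split_tuples("(1, 'user1', 1, 0, 0, 0)(2, 'user(2)', 1, 0, 0, 1)"));
```

Anything left over after the last closing parenthesis is dropped here; whether that is acceptable depends on how clean the input is.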
You can't use preg_split for that (as you don't match borders, but lengthier patterns). But it might be possible with a preg_match_all:
preg_match_all(':\( ((?R) | .)*? \):x', $source, $matches);
print_r($matches[0]);
Instead of a ?R recursive version, you could also just prepare the pattern for a single level of internal parentheses. But that wouldn't look much simpler actually.
:\( ( [^()]* | \( [^()]* \) )+ \):x
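As a quick check, both patterns can be run against the sample data from the question. This is only a sketch; the variable names $recursive and $singleLevel are chosen here for illustration.

```php
<?php
// Sample data from the question.
$source = "(1, 'user1', 1, 0, 0, 0)(2, 'user(2)', 1, 0, 0, 1)";

// Recursive variant: (?R) re-enters the whole pattern for nested parentheses.
preg_match_all(':\( ((?R) | .)*? \):x', $source, $recursive);

// Single-level variant: allows one level of parentheses inside a tuple.
preg_match_all(':\( ( [^()]* | \( [^()]* \) )+ \):x', $source, $singleLevel);

// Index 0 holds the full matches, one per tuple.
print_r($recursive[0]);
print_r($singleLevel[0]);
```

Both variants should give back the two tuples intact. Note that neither pattern looks at the quotes, so a value with an unbalanced parenthesis, such as 'user(2', would still throw them off.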
Your grammar appears to be
list: '(' num ( ',' term )(s?) ')'
term: num | str
num: /[0-9]+/
str: /'[^']*'/
So the pattern is
/ \G \s* \( \s* [0-9]+ (?: \s* , \s* (?: [0-9]+ | '[^']*' ) )* \s* \) /x
Well, that's just for matching. Extraction is trickier if PHP works like Perl. If you want to do it with a regex match, you have to do it in two passes.
First you extract the list:
/ \G \s* \( \s* ( [0-9]+ (?: \s* , \s* (?: [0-9]+ | '[^']*' ) )* ) \s* \) /x
Then you extract the terms from the list:
/ \G \s* ( [0-9]+ | '[^']*' ) (?: \s* , )? /x
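In PHP, the two passes might look something like the following sketch. It assumes preg_match_all honours \G between successive matches the way Perl's /g does; the variable names are illustrative, and nowdoc strings are used only to avoid escaping the quotes and backslashes in the patterns.

```php
<?php
// Sample data from the question.
$source = "(1, 'user1', 1, 0, 0, 0)(2, 'user(2)', 1, 0, 0, 1)";

// Pass 1 pattern: capture the contents of each parenthesised list.
$listPattern = <<<'REGEX'
/ \G \s* \( \s* ( [0-9]+ (?: \s* , \s* (?: [0-9]+ | '[^']*' ) )* ) \s* \) /x
REGEX;

// Pass 2 pattern: capture each term (number or quoted string) of one list.
$termPattern = <<<'REGEX'
/ \G \s* ( [0-9]+ | '[^']*' ) (?: \s* , )? /x
REGEX;

$rows = [];
if (preg_match_all($listPattern, $source, $lists)) {
    foreach ($lists[1] as $list) {
        // Group 1 holds the individual terms, quotes still attached.
        preg_match_all($termPattern, $list, $terms);
        $rows[] = $terms[1];
    }
}

print_r($rows);
```

Because the string rule is '[^']*', a parenthesis inside a quoted value is just an ordinary character, which is why 'user(2)' survives. Escaped quotes inside strings are not covered by this grammar.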