Ignore certain characters in quotes
I tried searching for the answer to this, but I couldn't find anything helpful for this situation. It's possible that I'm not searching for the right terms.
I'm having trouble with this regex. Consider this string:
$str = "(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')";
I want to end up with a multidimensional array, like this:
$arr = array(
array(1, 2, 'test (foo) bar'),
array(3, 4, '(hello,world)')
);
I figure I could run a regex to split it up into separate strings like "(1, 2, 'test (foo) bar')" and "(3, 4, '(hello,world)')", and then run a regex on each of those to split by comma. But as you can see, my problem is that the data has parentheses and commas inside various strings, and I'd like to ignore those.
So far I have this, which does the first part the way I wanted, except that it breaks if there are parentheses in the data.
preg_match_all('/\((.*?)\),?/', $str, $matches);
It gives me this:
Array
(
[0] => Array
(
[0] => (1, 2, 'test (foo)
[1] => (3, 4, '(hello,world)
)
[1] => Array
(
[0] => 1, 2, 'test (foo
[1] => 3, 4, '(hello,world
)
)
It truncates the data, naturally. What can I do to ignore the parentheses that are in quotes? If I can ignore them, then on the next step when I split each of these matches, I'll be able to ignore commas.
In general, you cannot do that with regexes. But in this case you can try this expression:
\(([^']*?'.*?')\),?
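As a rough sketch of how that expression could be used to build the array from the question: step 1 below is the expression above, while step 2 (the field-splitting regex) and the function name parse_tuples are my own assumptions, not part of this answer. It also assumes every group ends with a quoted string and that the strings contain no escaped quotes.

```php
<?php
// Sketch only. parse_tuples is a hypothetical helper name.
function parse_tuples(string $str): array
{
    // Step 1: capture the inside of each "( ... '...')" group
    // using the expression from the answer.
    preg_match_all("/\\(([^']*?'.*?')\\),?/", $str, $groups);

    $rows = [];
    foreach ($groups[1] as $group) {
        // Step 2 (assumption): a field is either a quoted string, or a run
        // of characters containing no comma and no whitespace.
        preg_match_all("/'[^']*'|[^,\\s]+/", $group, $fields);

        $row = [];
        foreach ($fields[0] as $field) {
            if ($field[0] === "'") {
                $row[] = substr($field, 1, -1); // drop the surrounding quotes
            } elseif (ctype_digit($field)) {
                $row[] = (int) $field;          // plain unsigned integer
            } else {
                $row[] = $field;                // anything else stays a string
            }
        }
        $rows[] = $row;
    }
    return $rows;
}

$str = "(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')";
$arr = parse_tuples($str);
print_r($arr);
```

Because the quoted alternative is tried first in step 2, commas and parentheses inside the quotes are never treated as separators.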
([0-9]+), (\'([A-Za-z0-9(), ]+)\')?
This appears to do what you want.
$matches Array:
(
[0] => Array
(
[0] => 1,
[1] => 2, 'test (foo) bar'
[2] => 3,
[3] => 4, '(hello,world)'
)
[1] => Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
)
[2] => Array
(
[0] =>
[1] => 'test (foo) bar'
[2] =>
[3] => '(hello,world)'
)
[3] => Array
(
[0] =>
[1] => test (foo) bar
[2] =>
[3] => (hello,world)
)
)
Is this closer?
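Note that this pattern returns one flat list of matches and does not keep the group boundaries. A possible way to get back to the nested array, assuming every group has exactly three fields (an assumption of mine, based only on the sample string), is to flatten the captures and chunk them:

```php
<?php
$str = "(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')";

// Same pattern as above: group 1 is the number, group 2 the quoted string
// including its quotes (if present), group 3 the string without quotes.
$pattern = '/([0-9]+), (\'([A-Za-z0-9(), ]+)\')?/';
preg_match_all($pattern, $str, $matches);

$values = [];
foreach ($matches[1] as $i => $number) {
    $values[] = (int) $number;
    if ($matches[2][$i] !== '') {
        // A quoted string followed this number.
        $values[] = $matches[3][$i];
    }
}

// Assumption: three fields per group.
$arr = array_chunk($values, 3);
print_r($arr);
```

The character class only allows letters, digits, parentheses, commas and spaces inside the quotes, so strings with other characters would need a wider class.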
Try this pattern:
$pattern = '/((?:.*?),(?:.*?),(?:.*?)),(.*)/';
This has the output:
Array
(
[0] => Array
(
[0] => (1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')
)
[1] => Array
(
[0] => (1, 2, 'test (foo) bar')
)
[2] => Array
(
[0] => (3, 4, '(hello,world)')
)
)
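A small sketch of running this pattern, for anyone trying it out. It splits by counting commas, so it appears to fit only input shaped like the sample (two groups, three fields in the first, no comma inside the first group's string). The second input below is a made-up example to show that limit.

```php
<?php
$pattern = '/((?:.*?),(?:.*?),(?:.*?)),(.*)/';

$str = "(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')";
preg_match($pattern, $str, $m);
$first  = $m[1];
$second = trim($m[2]); // group 2 begins with the space after the comma

// Hypothetical input: a comma inside the first quoted string
// moves the split point into the middle of that string.
$other = "(1, 2, 'a,b'), (3, 4, 'c')";
preg_match($pattern, $other, $n);
$firstOther = $n[1];

var_dump($first, $second, $firstOther);
```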