Help with regex pulling XML data from response body in PHP
I am working on a project that pulls data from a JMS queue using PHP and Zend Framework. The HTTP client response is below. All I need is the XML string.
I came up with /(.*)</RequestDetails>/gs which tests OK on http://gskinner.com/RegExr/ but the preg_match call is returning an empty matches array.
I'm going to continue to hunt around for a pattern, but thought I would post here as well.
Thanks to all who read, etc...
Steve
UPDATE: I can't get the code to paste correctly. Here's a link to a pastebin: http://pastebin.com/rQxzcfSg
The following snippet:
<?php
$text = <<<EOT
blah blah <0>
<RequestDetails><1><2><3>test</RequestDetails>
<RequestDetails><4><5><6>blah
more blah blah
</RequestDetails>
blah blah <7>
EOT;
print $text;
preg_match_all('/<RequestDetails>(.*?)<\/RequestDetails>/s', $text, $matches);
print_r($matches);
?>
Generates this output:
blah blah <0>
<RequestDetails><1><2><3>test</RequestDetails>
<RequestDetails><4><5><6>blah
more blah blah
</RequestDetails>
blah blah <7>
Array
(
[0] => Array
(
[0] => <RequestDetails><1><2><3>test</RequestDetails>
[1] => <RequestDetails><4><5><6>blah
more blah blah
</RequestDetails>
)
[1] => Array
(
[0] => <1><2><3>test
[1] => <4><5><6>blah
more blah blah
)
)
I've used preg_match_all instead of the /g flag, and also used (.*?) reluctant matching, which is really what you want to get multiple matches.
To see why it makes a difference: in the following text, there are two A.*?Z matches, but only one A.*Z match.
---A--Z---A--Z----
   ^^^^^^^^^^^
      A.*Z
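The same comparison can be run directly in PHP. This is only a small sketch using the sample string from the diagram above; the variable names are made up for illustration.

```php
<?php
// Same sample string as in the diagram above.
$subject = '---A--Z---A--Z----';

// Greedy: .* runs on to the last Z, so the whole span is a single match.
preg_match_all('/A.*Z/', $subject, $greedy);

// Reluctant: .*? stops at the first Z it can reach, so there are two matches.
preg_match_all('/A.*?Z/', $subject, $reluctant);

print_r($greedy[0]);    // one element: A--Z---A--Z
print_r($reluctant[0]); // two elements: A--Z and A--Z
```

The greedy pattern swallows everything between the first A and the last Z, which is exactly what happens when (.*) is used between two RequestDetails blocks.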
That said, parsing XML using regex is ill-advised. Use a proper XML parser; it'll make your life much easier.
I'd say, why bother with complex Regexes when PHP 5 comes with on-board tools like SimpleXML?
$xml = simplexml_load_string($string);
print_r($xml); // should output complete tree for you to walk through easily
You'd just have to remove the MIME parts and submit only the raw XML to the function, of course.
More on SimpleXML here.
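As a rough sketch of the "remove the MIME parts" step, something like the following might work. It assumes the body contains exactly one RequestDetails element; the sample body, the child element names (Id, Status) and the helper extractRequestDetails() are all invented for illustration, and the type hints need PHP 7.1 or later.

```php
<?php
// Hypothetical helper: cut the <RequestDetails> element out of a larger body.
// Assumes there is only one such element in the response.
function extractRequestDetails(string $body): ?string
{
    $closeTag = '</RequestDetails>';
    $start = strpos($body, '<RequestDetails');
    $end = strrpos($body, $closeTag);
    if ($start === false || $end === false || $end < $start) {
        return null; // no complete element found
    }
    return substr($body, $start, $end + strlen($closeTag) - $start);
}

// Made-up sample body standing in for the real HTTP client response.
$body = "--boundary\r\nContent-Type: text/xml\r\n\r\n"
      . "<RequestDetails><Id>42</Id><Status>test</Status></RequestDetails>\r\n"
      . "--boundary--\r\n";

$xmlString = extractRequestDetails($body);
$xml = $xmlString === null ? false : simplexml_load_string($xmlString);

if ($xml !== false) {
    echo (string) $xml->Id, "\n";     // prints 42
    echo (string) $xml->Status, "\n"; // prints test
}
```

Plain string functions are used for the cut so that the only thing interpreting the XML itself is the parser.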
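Your g modifier is invalid in PHP. Use m instead (for multiline). Test /(.*)<\/RequestDetails>/gs and /(.*)<\/RequestDetails>/ms using this tester. Note that m only changes how ^ and $ match; it is s that lets . match newlines, and preg_match_all takes the place of g.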
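To check the modifier behaviour locally rather than in an online tester, a short script along these lines could be used. The sample text is made up, and the @ operator is there only to silence the "Unknown modifier" warning for the demonstration.

```php
<?php
// Made-up sample text with the closing tag on its own line.
$text = "header line\n<RequestDetails><a>1</a>\n</RequestDetails>\ntrailer";

// 'g' is not a PCRE modifier in PHP: preg_match() raises a warning
// (suppressed here with @) and returns false instead of a match count.
$withG = @preg_match('/(.*)<\/RequestDetails>/gs', $text, $m1);
var_dump($withG); // bool(false)

// With only 's', the dot also matches newlines, so group 1 holds
// everything that precedes the closing tag.
$withS = preg_match('/(.*)<\/RequestDetails>/s', $text, $m2);
var_dump($withS); // int(1)
echo $m2[1], "\n";
```

A false return value (as opposed to 0) signals that the pattern itself failed to compile, which is worth checking for before inspecting the matches array.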