PHP XML parser is cutting URLs inside a node
Why is the parser cutting the URL down to this?
Inside the node:
http://img844.imageshack.us/content.php?page=done&l=img844/4783/php4dd.jpg
After parsing:
[done_page] => l=img844/8828/php4e8.jpg
private function _parse($result)
{
    $XMLparser = xml_parser_create('UTF-8');
    xml_set_element_handler(
        $XMLparser,
        Array($this, 'startElement'),
        Array($this, 'endElement')
    );
    xml_set_character_data_handler($XMLparser, Array($this, 'stringElement'));
    if (!xml_parse($XMLparser, $result)) {
        echo '<br>XML Error: '.xml_error_string(xml_get_error_code($XMLparser));
        echo ' at line '.xml_get_current_line_number($XMLparser);
        exit();
    }
    print_r($this->parsed_results);
    xml_parser_free($XMLparser);
}

public function stringElement($parser, $str)
{
    if (strlen(trim($str)) > 0) {
        $this->parsed_results[$this->current_name] = $str;
    }
}

public function startElement($parser, $name, $attributes)
{
    $this->current_name = $name;
}

public function endElement($parser, $name)
{
}
<?xml version="1.0" encoding="iso-8859-1"?><links>
<image_link>http://img844.imageshack.us/img844/8828/php4e8.jpg</image_link>
<thumb_link>http://img844.imageshack.us/img844/8828/php4e8.th.jpg</thumb_link>
<ad_link>http://img844.imageshack.us/my.php?image=php4e8.jpg</ad_link>
<thumb_exists>yes</thumb_exists>
<total_raters>0</total_raters>
<ave_rating>0.0</ave_rating>
<image_location>img844/8828/php4e8.jpg</image_location>
<thumb_location>img844/8828/php4e8.th.jpg</thumb_location>
<server>img844</server>
<image_name>php4e8.jpg</image_name>
<done_page>http://img844.imageshack.us/content.php?page=done&amp;l=img844/8828/php4e8.jpg</done_page>
<resolution>468x458</resolution>
<filesize>118347</filesize>
<image_class>r</image_class>
</links>
The parser may have returned the text as two contiguous text nodes, resulting in stringElement being called twice. The second call would have overwritten the text from the first text node. Try changing stringElement so that it concatenates the input to any existing text and see if that returns the entire string.
On second examination, I'm pretty sure the parser returns the &amp; entity reference as a separate node, so you may have to reassemble all the text yourself. Depending on the parser implementation, the entity reference may be a different kind of node, so you will have to research what your particular parser does with entity references.
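As a rough sketch of the concatenation idea, the handlers could buffer the text per element and only store it once the element closes. The class name LinkParser, the $buffer property, and the exception on error are hypothetical and not from the question. The sketch also assumes a flat document like the one shown (no nested data elements), that case folding should be switched off to get lowercase keys, and that trimming whitespace around each value is acceptable.

```php
<?php
// Hypothetical class name; the handler names mirror the question's code.
class LinkParser
{
    public $parsed_results = array();
    private $current_name = null;
    private $buffer = '';

    public function parse($xml)
    {
        $parser = xml_parser_create('UTF-8');
        // Assumption: case folding is turned off so keys stay lowercase,
        // matching the [done_page] key shown in the question.
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        xml_set_element_handler(
            $parser,
            array($this, 'startElement'),
            array($this, 'endElement')
        );
        xml_set_character_data_handler($parser, array($this, 'stringElement'));

        if (!xml_parse($parser, $xml, true)) {
            throw new RuntimeException(sprintf(
                'XML error: %s at line %d',
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            ));
        }
        return $this->parsed_results;
    }

    public function startElement($parser, $name, $attributes)
    {
        // A new element begins: start collecting its text from scratch.
        $this->current_name = $name;
        $this->buffer = '';
    }

    public function stringElement($parser, $str)
    {
        // May be called several times for one element, so append
        // instead of overwriting. Text outside a tracked element is ignored.
        if ($this->current_name !== null) {
            $this->buffer .= $str;
        }
    }

    public function endElement($parser, $name)
    {
        // Store the reassembled text only for the element being tracked.
        if ($this->current_name === $name) {
            $this->parsed_results[$name] = trim($this->buffer);
        }
        $this->current_name = null;
    }
}

// Usage with a reduced version of the document from the question.
$xml = '<links><done_page>http://img844.imageshack.us/content.php?page=done&amp;l=img844/8828/php4e8.jpg</done_page></links>';
$links = new LinkParser();
print_r($links->parse($xml));
```

Because the text is appended rather than assigned, it should not matter how many pieces the parser splits the URL into around the &amp; reference: the pieces are joined again before the value is stored.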