How to get a specific node text using php DOM
I am trying to get the value (text) of a specific node from an xml document using php DOM classes but I cannot do it right because I get the text content of that node merged with its descendants.
Let's suppose that I need to get the trees from this document:
<?xml version="1.0"?>
<trees>
LarchRedwoodChestnutBirch
<trimmed>Larch</开发者_JAVA百科trimmed>
<trimmed>Redwood</trimmed>
</trees>
And I get:
LarchRedwoodChestnutBirchLarchRedwood
You can see that I cannot remove the substring LarchRedwood made by the trimmed trees from the whole text because I would get only ChestnutBirch and it is not what I need.
Any suggest? (Thanx)
I got it. This works:
function specificNodeValue($node, $implode = true) {
$value = array();
if ($node->childNodes) {
for ($i = 0; $i < $node->childNodes->length; $i++) {
if (!(@$node->childNodes->item($i)->tagName)) {
$value[] = $node->childNodes->item($i)->nodeValue;
}
}
}
return (is_string($implode) ? implode($implode, $value) : ($implode === true ? implode($value) : $value));
}
A given node is like a root, if you get no tagName when you parse its child nodes then it is itself, so the value of that child node it is its own value.
Inside a bad formed xml document a node could have many pieces of value, put them all into an array to get the whole value of the node.
Use the function above to get needed node value without subnode values merged within.
Parameters are:
$node (required) must be a DOMElement object
$implode (optional) if you want to get a string (true by default) or an array (false) made up by many pieces of value. (Set a string instead of a boolean value if you wish to implode the array using a "glue" string).
You can try this to remove the trimmed
node
$doc = new DOMDocument('1.0', 'utf-8');
$doc->loadXML($xml);
$xpath = new DOMXpath($doc);
$trees = $doc->getElementsByTagName('trees')->item(0);
foreach ($xpath->query('/trees/*') as $node)
{
$trees->removeChild($node);
}
echo $trees->textContent;
echo $trees->nodeValue;
Use $node->nodeValue
to get a node's text content. If you use $node->textContent
, you get all text from the current node and all child nodes.
Ideally, the XML should be:
<?xml version="1.0"?>
<trees>
<tree>Larch</tree>
<tree>Redwood</tree>
<tree>Chestnut</tree>
<tree>Birch</tree>
</trees>
To split "LarchRedwoodChestnutBirch" into separate words (by capital letter), you'll need to use PHP's "PCRE" functions:
http://www.php.net/manual/en/book.pcre.php
'Hope that helps!
精彩评论