Extracting data from an XML file with SimpleXML in PHP
Introduction:
I want to loop through XML files with a flexible category structure.
Problem:
I don't know how to loop through a theoretically infinite number of subcategories without writing x nested "foreach" statements (see the coding example at the bottom). How do I dynamically traverse the category structure?
<?xml version="1.0" encoding="utf-8"?>
<catalog>
<category name="Category - level 1">
<category name="Category - level 2" />
<category name="Category - level 2">
<category name="Category - level 3" />
</category>
<category name="Category - level 2">
<category name="Category - level 3">
<category name="Category - level 4" />
</category>
</category>
</category>
</catalog>
What I have now:
I have no problem looping through XML files with a set structure:
<catalog>
<category name="Category - level 1">
<category name="Category - level 2">
<category name="Category - level 3" />
</category>
<category name="Category - level 2">
<category name="Category - level 3" />
</category>
</category>
</catalog>
Coding example:
//$xml holds the XML file
foreach ( $xml AS $category_level1 )
{
echo $category_level1['name'];
foreach ( $category_level1->category AS $category_level2 )
{
echo $category_level2['name'];
foreach ( $category_level2->category AS $category_level3 )
{
echo $category_level3['name'];
}
}
}
Getting the name attributes from your categories is likely fastest when done via XPath, e.g.
$categoryNames = $doc->xpath('//category/@name');
However, if you want to recursively iterate over an arbitrarily nested XML structure, you can also use SimpleXMLIterator, e.g. with $xml being the string you gave:
$sxi = new RecursiveIteratorIterator(
new SimpleXMLIterator($xml),
RecursiveIteratorIterator::SELF_FIRST);
foreach($sxi as $node) {
echo str_repeat("\t", $sxi->getDepth()), // indenting
$node['name'], // getting attribute name
PHP_EOL; // line break
}
will give
Category - level 1
    Category - level 2
    Category - level 2
        Category - level 3
    Category - level 2
        Category - level 3
            Category - level 4
As said at the beginning, when you just want to get all name attributes, use XPath, because iterating over each and every node is slow. Use the iterator approach only when you want to do more complex things with the nodes, for instance adding something to them.
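As an illustration of "adding something to the nodes", here is a minimal sketch that uses the same iterator to tag every element with its nesting level. The attribute name "depth" and the shortened sample XML are assumptions made for this example, not something from the question.

```php
<?php
// Shortened sample data, modelled on the catalog from the question.
$xml = '<catalog>
  <category name="Category - level 1">
    <category name="Category - level 2">
      <category name="Category - level 3" />
    </category>
  </category>
</catalog>';

$root = new SimpleXMLIterator($xml);
$sxi  = new RecursiveIteratorIterator(
    $root,
    RecursiveIteratorIterator::SELF_FIRST
);

foreach ($sxi as $node) {
    // "depth" is a hypothetical attribute name chosen for this sketch.
    $node->addAttribute('depth', (string) $sxi->getDepth());
}

// The attributes were added to the underlying document.
echo $root->asXML();
```

Here the iteration does work that a plain XPath query for names would not cover, which is the case where the slower node-by-node traversal pays off.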
<?php
$xml= new SimpleXMLElement('.....');
foreach ($xml->xpath('//category') as $cat)
{
echo $cat['name'];
}
A possible solution is to write a recursive function that would:
- Loop over each category at the current depth
- Write the name of the current category
- If it has any child categories, call itself on those.
An advantage of such a solution is that you can keep track of your current depth in the XML document, which can be useful if you need to represent your data as a tree, for instance.
For example, if you have your XML loaded like this :
$string = <<<XML
<catalog>
<category name="Category - level 1">
<category name="Category - level 2">
<category name="Category - level 3" />
</category>
<category name="Category - level 2">
<category name="Category - level 3" />
</category>
</category>
</catalog>
XML;
$xml = simplexml_load_string($string);
You could call the recursive function like this :
recurse_category($xml);
And that function could be written this way :
function recurse_category($categories, $depth = 0) {
foreach ($categories as $category) {
echo str_repeat('&nbsp;', 2*$depth); // non-breaking spaces, because plain spaces collapse in HTML output
echo (string)$category['name'];
echo '<br />';
if ($category->category) {
recurse_category($category->category, $depth + 1);
}
}
}
Finally, running this code would give you this kind of output:
Category - level 1
  Category - level 2
    Category - level 3
  Category - level 2
    Category - level 3
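If the goal is to represent the data as a tree rather than print it, the same recursion can return nested arrays instead of echoing. This is only a sketch: the function name category_tree and the 'name' / 'children' array keys are made up for this example.

```php
<?php
// Hypothetical helper: turns the <category> children of $parent
// into a nested array structure.
function category_tree(SimpleXMLElement $parent): array
{
    $tree = [];
    foreach ($parent->category as $category) {
        $tree[] = [
            'name'     => (string) $category['name'],
            // Recurse; yields an empty array when there are no child categories.
            'children' => category_tree($category),
        ];
    }
    return $tree;
}

$xml = simplexml_load_string(
    '<catalog>
       <category name="Category - level 1">
         <category name="Category - level 2">
           <category name="Category - level 3" />
         </category>
         <category name="Category - level 2" />
       </category>
     </catalog>'
);

$tree = category_tree($xml);
print_r($tree);
```

Because the function only ever looks one level down and then calls itself, it handles any nesting depth without additional foreach statements.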
Using SimpleXML and XPath is fine...
...but just as a side note: if all you want to achieve is to get the name attribute of each and every <category> element in the document, DOMDocument::getElementsByTagName() would suffice.
You can switch between DOM and simplexml via dom_import_simplexml() and simplexml_import_dom(). Both use the same internal representation of the data, so there's no costly conversion involved.
$xml = '<?xml version="1.0" encoding="utf-8"?>
<catalog>
<category name="Category - level 1">
<category name="Category - level 2" />
<category name="Category - level 2">
<category name="Category - level 3" />
</category>
<category name="Category - level 2">
<category name="Category - level 3">
<category name="Category - level 4" />
</category>
</category>
</category>
</catalog>';
$doc = new DOMDocument;
$doc->loadXML($xml);
foreach( $doc->getElementsByTagName('category') as $c) {
echo $c->getAttribute('name'), "\n";
}
prints
Category - level 1
Category - level 2
Category - level 2
Category - level 3
Category - level 2
Category - level 3
Category - level 4
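The switch between the two APIs mentioned above could look roughly like this. It is a small sketch with made-up sample data; dom_import_simplexml() and simplexml_import_dom() are the only conversion calls involved.

```php
<?php
$sxe = simplexml_load_string(
    '<catalog>
       <category name="Category - level 1">
         <category name="Category - level 2" />
       </category>
     </catalog>'
);

// SimpleXML -> DOM: gives a DOMElement for the same underlying node.
$domElement = dom_import_simplexml($sxe);

$names = [];
foreach ($domElement->getElementsByTagName('category') as $c) {
    $names[] = $c->getAttribute('name');
}
echo implode(', ', $names), PHP_EOL;

// DOM -> SimpleXML: wrap a DOM node to use the SimpleXML syntax again.
$firstCategory = simplexml_import_dom(
    $domElement->getElementsByTagName('category')->item(0)
);
echo $firstCategory['name'], PHP_EOL;
```

This makes it possible to keep most of the code in SimpleXML and drop into DOM only for the calls SimpleXML lacks.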