How can I parse large XML files in PHP?
I am parsing an XML file that is about 12 MB. I need to go through the entire file and store the values I need in a MySQL database.
I am turning the XML file into an array. Then I parse through the array and store the values.
This works fine when the XML is really small, but it stops behaving correctly when I run it on my 12 MB file.
I tried multiple functions that convert XML to array that I found online and none of them work.
This is a common error I got with two different XML to array functions I found online:
Fatal error: [] operator not supported for strings
I am using SimpleXML. Is there a better way of resolving this? Are there any libraries other than SimpleXML that are powerful enough to handle large XML files?
I have this now:
$z = new XMLReader;
$z->open('feedfetch.xml');
$doc = new DOMDocument;
while ($z->read() && $z->name !== 'collection');
while ($z->name === 'collection')
{
    $node = simplexml_import_dom($doc->importNode($z->expand(), true));
    var_dump($node[0]);
    exit;
    $z->next('collection');
}
See the var_dump() call? It prints a bunch of XML objects, but I don't know how to get to the actual node with the data.
Switch from SimpleXML to XMLReader when working with large XML files. XMLReader is a pull parser: it moves through the document node by node and does not load the entire file into memory to process it.
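As a rough sketch of that approach, the loop below streams a file with XMLReader and expands only one <collection> element at a time into a small SimpleXML tree. The sample feed and the child element names (id, title) are assumptions made up for illustration; substitute whatever your real feed contains.

```php
<?php
// Hypothetical sample feed; the <id> and <title> children are assumptions.
$xml = <<<XML
<feed>
  <collection><id>1</id><title>First</title></collection>
  <collection><id>2</id><title>Second</title></collection>
</feed>
XML;
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, $xml);

$reader = new XMLReader();
$reader->open($path);
$doc = new DOMDocument();
$rows = [];

// Skip forward until the cursor sits on the first <collection> element.
while ($reader->read() && $reader->name !== 'collection');

while ($reader->name === 'collection') {
    // Expand only the current element, then wrap it for SimpleXML access.
    $node = simplexml_import_dom($doc->importNode($reader->expand(), true));

    // Child elements are read as properties; cast them to get plain values.
    $rows[] = ['id' => (int) $node->id, 'title' => (string) $node->title];

    // Jump to the next sibling <collection>, skipping the current subtree.
    $reader->next('collection');
}

$reader->close();
unlink($path);

print_r($rows);
```

In a real import you would probably run the database INSERT inside the loop instead of collecting rows in an array, so that memory use stays flat regardless of file size.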
SimpleXML is a good example of black-boxed code that does magic under the covers to make things look simpler than they are. In other words, don't do a var_dump() of a SimpleXML object; you will get confused.
An XML file loaded into SimpleXML can be used like nested objects and arrays of objects. You can reference nested elements with $dom->element->subelement. Yes, it feels funny at first, but you will quickly get used to it. You do have to pay strict attention to your XML format, though, or you might end up trying to access elements that don't exist. That's kind of what your error is.
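Here is a minimal sketch of that access style. The document and its element names (library, book, title, author, isbn) are invented for the example; the point is the property syntax, the string casts, and the isset() check before touching an element that may be absent.

```php
<?php
// Hypothetical document; every element and attribute name is an assumption.
$dom = simplexml_load_string(
    '<library>'
    . '<book id="7"><title>Dune</title><author><name>Frank Herbert</name></author></book>'
    . '<book id="8"><title>Emma</title><author><name>Jane Austen</name></author></book>'
    . '</library>'
);

// Nested elements read like object properties; cast to get the text.
// $dom->book without an index refers to the first <book>.
$firstTitle  = (string) $dom->book->title;
$firstAuthor = (string) $dom->book->author->name;

// Attributes use array syntax on the element.
$firstId = (int) $dom->book['id'];

// Repeated elements can be iterated like an array of objects.
$titles = [];
foreach ($dom->book as $book) {
    $titles[] = (string) $book->title;
}

// Check for optional elements instead of assuming they exist.
$hasIsbn = isset($dom->book->isbn);

print_r([$firstTitle, $firstAuthor, $firstId, $titles, $hasIsbn]);
```

Casting to string (or int) is what gets you from the SimpleXMLElement object to the actual data.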
Unfortunately, SimpleXML pulls the whole XML file into memory and parses it. This gives you the advantage of random access, but at the cost of taking up a lot of memory, perhaps unnecessarily. That said, 12 MB isn't beyond what SimpleXML is capable of, and the error message you gave is not an out-of-memory error.
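I can't see the converter functions you used, so this is only a guess at the mechanism: the "[] operator not supported for strings" message appears when code appends with [] to a value that is currently a string. A hand-rolled XML-to-array function can hit that when an element it first stored as a string turns out to repeat. The sketch below assumes PHP 7 or later (where the failure is a catchable Error) and uses a made-up $result array to reproduce and work around it.

```php
<?php
// Hypothetical converter state: the first <item> was stored as a plain string.
$result = ['item' => 'first'];
$message = null;

try {
    // A second <item> arrives and the converter tries to append to the string.
    $result['item'][] = 'second';
} catch (Error $e) {
    // PHP 7+ raises an Error here instead of halting outright.
    $message = $e->getMessage();
}
echo $message, "\n";

// One possible fix: promote the existing value to a list before appending.
$result['item'] = (array) $result['item'];
$result['item'][] = 'second';

print_r($result);
```

If this is what is happening, the large file probably just contains a repeated element that your small test files did not.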