Problem parsing XML with namespaces
Hi, I have an XML file which I want to parse. It looks something like this:
<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
<SHOPITEM>
<ID>2332</ID>
...
</SHOPITEM>
<SHOPITEM>
<ID>4433</ID>
...
</SHOPITEM>
</SHOP>
My parsing code is:
from lxml import etree

# etree.parse accepts a file name directly, so no explicit open() is needed
file_data = etree.parse('sample-file.xml')
for item in file_data.iter('SHOPITEM'):
    print(item)
But the items are printed only when the XML root element
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
is changed to
<SHOP>
How can I parse the XML document without worrying about the namespace declarations on the root element?
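See the lxml documentation for an explanation of how lxml.etree handles namespaces. The default xmlns declaration on SHOP puts every unprefixed child element into that namespace, so the full tag name is {http://www.w3.org/1999/xhtml}SHOPITEM, not SHOPITEM. In general, you should work with namespaces rather than try to avoid them. In this case, write: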
for item in file_data.iter('{http://www.w3.org/1999/xhtml}SHOPITEM'):
If you need to refer to the namespace frequently, set up a local variable:
xhtml_ns = '{http://www.w3.org/1999/xhtml}'
...
for item in file_data.iter(xhtml_ns + 'SHOPITEM'):
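For reference, here is a minimal, self-contained sketch of the approach above. It parses an inline copy of the sample document instead of sample-file.xml, and the names SAMPLE, XHTML_NS, item_ids and unqualified_count are made up for illustration. It also assumes that falling back to the standard library's xml.etree.ElementTree is acceptable when lxml is not installed; both accept the same {namespace}tag notation for the calls used here.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the stdlib module is an acceptable stand-in for this sketch.
    import xml.etree.ElementTree as etree

# Inline copy of the sample document (bytes, because it carries an
# encoding declaration).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
  <SHOPITEM><ID>2332</ID></SHOPITEM>
  <SHOPITEM><ID>4433</ID></SHOPITEM>
</SHOP>"""

# The default namespace from the root element, in {uri} form.
XHTML_NS = '{http://www.w3.org/1999/xhtml}'


def item_ids(xml_bytes):
    """Return the text of every ID found inside a namespaced SHOPITEM."""
    root = etree.fromstring(xml_bytes)
    ids = []
    for item in root.iter(XHTML_NS + 'SHOPITEM'):
        # Child lookups need the namespace as well.
        ids.append(item.findtext(XHTML_NS + 'ID'))
    return ids


def unqualified_count(xml_bytes):
    """Count matches for the bare tag name, as in the question's code."""
    root = etree.fromstring(xml_bytes)
    return len(list(root.iter('SHOPITEM')))


print(item_ids(SAMPLE))
print(unqualified_count(SAMPLE))
```

The second function shows why the original loop printed nothing: the bare name SHOPITEM only matches elements that are in no namespace at all.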