
Problem parsing XML with namespaces

Hi, I have an XML file which I want to parse. It looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
    <SHOPITEM>
        <ID>2332</ID>
        ...
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
        ...
    </SHOPITEM>
</SHOP>

My parsing code is:

from lxml import etree

ifile = open('sample-file.xml', 'r')
file_data = etree.parse(ifile)

for item in file_data.iter('SHOPITEM'):
    print(item)

But items are printed only when the XML root element

<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">

looks like

<SHOP>

How can I parse the XML document without worrying about the namespace declarations on the root element?


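See the lxml documentation for an explanation of how lxml.etree handles namespaces. In general, you should work with them rather than try to avoid them. The default namespace declared on <SHOP> applies to every child element as well, so each tag name has to be qualified with it. In this case, write: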

for item in file_data.iter('{http://www.w3.org/1999/xhtml}SHOPITEM'):

If you need to refer to the namespace frequently, set up a local variable:

xhtml_ns = '{http://www.w3.org/1999/xhtml}'
...
for item in file_data.iter(xhtml_ns + 'SHOPITEM'):
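To see the difference end to end, here is a minimal sketch that parses an inline copy of the sample instead of sample-file.xml. The names SAMPLE, XHTML_NS and shop_item_ids, and the prefix "x", are made up for illustration, and the "..." children from the question are simply left out. The fallback import is only there so the snippet still runs where lxml is not installed; the calls used behave the same way in both libraries.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in for this demo.
    import xml.etree.ElementTree as etree

# Inline stand-in for sample-file.xml (hypothetical, trimmed to the ID children).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
    <SHOPITEM>
        <ID>2332</ID>
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
    </SHOPITEM>
</SHOP>"""

XHTML_NS = "http://www.w3.org/1999/xhtml"


def shop_item_ids(xml_bytes):
    """Return the text of each SHOPITEM's ID child, using qualified tag names."""
    root = etree.fromstring(xml_bytes)
    item_tag = "{%s}SHOPITEM" % XHTML_NS
    id_tag = "{%s}ID" % XHTML_NS
    return [item.findtext(id_tag) for item in root.iter(item_tag)]


root = etree.fromstring(SAMPLE)

# The bare name finds nothing: every element inherits the default namespace.
unqualified = list(root.iter("SHOPITEM"))

# The qualified name finds both items.
ids = shop_item_ids(SAMPLE)

# Same lookup through a prefix map; the prefix is arbitrary, only the URI matters.
via_prefix = root.findall("x:SHOPITEM", namespaces={"x": XHTML_NS})

print(len(unqualified), ids, len(via_prefix))  # 0 ['2332', '4433'] 2
```

Note that the child elements (ID here) need the qualified name too, not just SHOPITEM.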