IOError with lxml etree parse function
I have logic like this:
    for root, dirs, files in os.walk(os.getcwd()):
        if "info.xml" in files:
            root = lxml.etree.parse("%s/info.xml" % root)
            tag = root.xpath("/info/tagname")[0].text
When it parses one info.xml that is nested very deep under the current path, I get this error message:
Traceback (most recent call last):
File "/home/work/mergefile.py", line 365, in <module>
File "/home/work/mergefile.py", line 344, in merge_ejb_files
File "/home/work/mergefile.py", line 63, in __init__
File "/home/work/mergefile.py", line 78, in _parse_info2doc
File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
File "parser.pxi", line 1491, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71205)
File "parser.pxi", line 1520, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:71488)
File "parser.pxi", line 1420, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:70583)
File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:67736)
File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
File "parser.pxi", line 563, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64056)
IOError: Error reading file '/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml': failed to load external entity "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml"
But the file "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml" exists, and I can parse it with lxml in IPython.
Do you know what the problem is? Any help would be appreciated. Thank you!
Here's my solution, as per my comment above. I'm opening the files for reading, then closing them right after, so I don't hit the 1024 open-file limit.
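    import os
    import lxml.etree as etree

    for root, dirs, files in os.walk(os.getcwd()):
        if "info.xml" in files:
            # the with block closes the file as soon as parsing is done
            with open('%s/info.xml' % root) as processfile:  # use 'rb' if necessary
                xml = etree.parse(processfile)
            # query the parsed document, not the directory path held in root
            tag = xml.xpath("/info/tagname")[0].text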
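For reuse, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions of mine: the name collect_tagnames is made up, the fallback to the standard library's xml.etree.ElementTree when lxml is not installed is my addition, and I read the tag with findtext on the root element instead of xpath so the code works with either parser.

```python
import os

try:
    from lxml import etree  # the parser used in the question
except ImportError:
    # Assumption: fall back to the stdlib parser when lxml is unavailable.
    import xml.etree.ElementTree as etree


def collect_tagnames(top):
    """Return {path: text of <tagname>} for every info.xml below `top`.

    Hypothetical helper, not part of lxml.
    """
    results = {}
    for dirpath, dirnames, filenames in os.walk(top):
        if "info.xml" not in filenames:
            continue
        path = os.path.join(dirpath, "info.xml")
        # Binary mode lets the parser honour the XML encoding declaration.
        # The with block closes the descriptor before the next file is opened.
        with open(path, "rb") as handle:
            tree = etree.parse(handle)
        info = tree.getroot()
        if info.tag == "info":
            # findtext returns None when <tagname> is missing.
            results[path] = info.findtext("tagname")
    return results
```

Calling collect_tagnames(os.getcwd()) covers the same tree as the loop above. If the IOError still shows up, it may be worth comparing the number of files the process has open against the limit reported by ulimit -n.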