How do I pass an XML file to lxml to parse?

I'm trying to parse an xml file using lxml. xml.etree allowed me to simply pass the file name as a parameter to the parse function, so I attempted to do the same with lxml.

My code:

from lxml import etree
from lxml import objectify

file = r"C:\Projects\python\cb.xml"
tree = etree.parse(file)

but I get the error:

Traceback (most recent call last):
  File "cb.py", line 5, in <module>
    tree = etree.parse(file)
  File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
  File "parser.pxi", line 1491, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71205)
  File "parser.pxi", line 1520, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:71488)
  File "parser.pxi", line 1420, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:70583)
  File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:67736)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64084)
lxml.etree.XMLSyntaxError: AttValue: " or ' expected, line 2, column 26

What am I doing wrong?


What you are doing wrong is (1) not checking whether xml.etree gives the same outcome on the same file, and (2) not reading the error message. It reports a syntax error on line 2 of the file, which is well downstream of any file-opening issue. The file was found and opened; its content is what the parser rejects.
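To make that check concrete, here is a minimal sketch that runs the standard-library parser on a small document and reports where parsing stops. The sample text and the helper name first_syntax_error are made up for illustration; I am only assuming the real file has a similar problem, such as an attribute value without quotes on line 2.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the file: the attribute value on line 2 has no quotes.
BROKEN_XML = '<root>\n  <item id=1/>\n</root>'


def first_syntax_error(xml_text):
    """Return the (line, column) of the first syntax error, or None if the text parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ParseError.position is a (line, column) tuple.
        return exc.position
    return None


print(first_syntax_error(BROKEN_XML))
print(first_syntax_error('<root><item id="1"/></root>'))
```

If xml.etree rejects the same file at the same line, the problem is the file content and not the way the path is passed to lxml.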


I stumbled across a similar error message this morning, and for me the cause was a malformed DTD. My DTD contained an attribute definition whose default value was not enclosed in quotes. As soon as I fixed that, the error went away.
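A rough sketch of that situation, using an internal DTD subset and the standard-library parser as a stand-in (lxml words its error differently). The two documents and the helper is_well_formed are invented for illustration; they differ only in whether the default value is quoted.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the ATTLIST default value is unquoted in the first one.
UNQUOTED_DEFAULT = '<!DOCTYPE note [\n<!ATTLIST note kind CDATA plain>\n]>\n<note/>'
QUOTED_DEFAULT = '<!DOCTYPE note [\n<!ATTLIST note kind CDATA "plain">\n]>\n<note/>'


def is_well_formed(xml_text):
    """Return True if the text parses, False if the parser reports a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(UNQUOTED_DEFAULT))
print(is_well_formed(QUOTED_DEFAULT))
```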


You aren't doing anything wrong in your code. You have a syntax error in your XML markup.


lxml allows you to load broken XML by creating a parser instance with recover=True and passing it to parse:

parser = etree.XMLParser(recover=True)
tree = etree.parse(file, parser)

While this is not ideal, I use it to load an XML file for schema/DTD/Schematron validation.
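For what it's worth, a recovering parser still records what it had to skip over, so it seems worth looking at its error log after loading. A small sketch, assuming lxml is installed; the sample document is hypothetical, and what survives recovery depends on how libxml2 handles the particular error.

```python
from lxml import etree

# Hypothetical broken input: the attribute value on line 2 has no quotes.
BROKEN_XML = b'<root>\n  <item id=1 name="a"/>\n</root>'

# recover=True asks the parser to keep going after syntax errors.
parser = etree.XMLParser(recover=True)
root = etree.fromstring(BROKEN_XML, parser)

# The errors the parser recovered from are kept in parser.error_log.
for entry in parser.error_log:
    print(entry.line, entry.column, entry.message)

print(root.tag)
```

Treat the recovered tree with care: parts of the input may have been dropped, so the log is the only record of what the parser could not read.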
