XML parsing with constant memory usage
I am trying to find an XML parser with XPath support that uses a small, or rather constant, amount of memory. I am trying to parse large XML files, close to 1 GB. I have been reading about XQilla, and it seems that it uses a very large amount of memory because it is DOM-based; correct me if I'm wrong. Anyway, any ideas for such an XML parser for C++ on Linux?
If you can process the XML in essentially a single pass, a SAX parser would be a good idea. How about Apache Xerces-C++?
Saxon-EE supports streaming of large XML documents using XSLT or XQuery (streaming is better supported in XSLT than in XQuery). Details are in the Saxon documentation under "Streaming of Large Documents".
You might look at pugixml:
pugixml enables very fast, convenient and memory-efficient XML document processing. However, since pugixml has a DOM parser, it can't process XML documents that do not fit in memory; also, the parser is a non-validating one, so if you need DTD/Schema validation, the library is not for you.
However, it is explicitly not a streaming parser. I know streaming and XPath do not generally mix well (due to potential random-access requirements), although in .NET the well-known XPathReader seemed to have bridged the gap for a popular subset of XPath :)