Processing large XML files
I have a large XML file that contains many sub-elements, and I want to be able to run some XPath queries against it. I tried using VTD-XML in Java, but I sometimes get an OutOfMemoryError because the XML is too large to fit into memory. Is there an alternative way of processing such large XML files?
Try http://code.google.com/p/jlibs/wiki/XMLDog
It evaluates XPaths using SAX, without creating an in-memory representation of the XML document.
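A rough sketch of how that might look; the class and method names (XMLDog, addXPath, sniff, getResult) are recalled from the jlibs wiki and may differ between versions, and the file name and XPath are placeholders:

```java
import jlibs.xml.DefaultNamespaceContext;
import jlibs.xml.sax.dog.XMLDog;
import jlibs.xml.sax.dog.XPathResults;
import jlibs.xml.sax.dog.expr.Expression;
import org.xml.sax.InputSource;

public class XmlDogExample {
    public static void main(String[] args) throws Exception {
        // Declare any namespace prefixes your XPaths use; none are needed here
        DefaultNamespaceContext nsContext = new DefaultNamespaceContext();
        XMLDog dog = new XMLDog(nsContext);

        // Register all XPaths before sniffing; they are evaluated in a single SAX pass
        Expression xpath = dog.addXPath("/orders/order/@id");

        // Streams the document instead of building an in-memory tree
        XPathResults results = dog.sniff(new InputSource("huge.xml"));

        System.out.println(results.getResult(xpath));
    }
}
```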
A SAX parser is very efficient when working with large files.
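For illustration, a minimal handler using the standard javax.xml.parsers SAX API; the element name "order" and the file name are made up:

```java
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class CountElements {
    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        // Callbacks fire as the parser streams through the file,
        // so memory use stays roughly constant regardless of document size.
        parser.parse(new File("huge.xml"), new DefaultHandler() {
            private int count;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("order".equals(qName)) {
                    count++;
                }
            }

            @Override
            public void endDocument() {
                System.out.println("order elements: " + count);
            }
        });
    }
}
```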
What are you trying to do right now? By the sound of it, you are using a DOM-based parser, which loads the entire XML file into memory as a DOM representation. If you are dealing with a large file, you'd be better off using a SAX parser, which processes the XML document in a streaming fashion.
I personally recommend StAX for this.
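As a sketch, the StAX cursor API (javax.xml.stream) lets you pull events one at a time and emulate a simple path query by hand; the element and attribute names here are placeholders:

```java
import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxScan {
    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();

        try (FileInputStream in = new FileInputStream("huge.xml")) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);

            // Pull events one at a time; only the current event is held in memory.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "order".equals(reader.getLocalName())) {
                    System.out.println("id = " + reader.getAttributeValue(null, "id"));
                }
            }
            reader.close();
        }
    }
}
```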
Did you use standard VTD or extended VTD-XML? If you use extended VTD-XML, you have the option of using memory mapping... did you try that?
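If you want to try the memory-mapped route, here is a rough sketch with the extended API; the VTDGenHuge / AutoPilotHuge classes and the MEM_MAPPED flag are as I remember them from the extended VTD-XML docs, so verify against your version, and the file name and XPath are placeholders:

```java
import com.ximpleware.extended.AutoPilotHuge;
import com.ximpleware.extended.VTDGenHuge;
import com.ximpleware.extended.VTDNavHuge;

public class HugeVtdExample {
    public static void main(String[] args) throws Exception {
        VTDGenHuge vg = new VTDGenHuge();

        // MEM_MAPPED maps the file from disk instead of reading the whole
        // document into a byte array on the heap.
        if (vg.parseFile("huge.xml", true, VTDGenHuge.MEM_MAPPED)) {
            VTDNavHuge nav = vg.getNav();
            AutoPilotHuge ap = new AutoPilotHuge(nav);
            ap.selectXPath("/orders/order");

            int count = 0;
            while (ap.evalXPath() != -1) {
                count++;
            }
            System.out.println("matched nodes: " + count);
        }
    }
}
```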
Using XPath might not be a very good idea if you plan on compiling many expressions dynamically in a long-lived application.
I'm not entirely sure how the Java version of XPath works, but in .NET an XPath expression is compiled into a dynamic assembly, which is then added to the app domain. Subsequent uses of the expression reuse the assembly now loaded into memory.
In one case where I was using XPath, this led to a situation where, I think, that same mechanism was slowly filling up memory, much like a memory leak.
My theory is that because each expression was compiled using values from the user, each compiled expression was likely unique, so a new assembly was compiled and added to the app domain every time.
Since you can't remove an assembly from the app domain without restarting the entire app domain, memory was consumed each time an expression was evaluated and could not be recovered. As a result, the code was leaking memory in the form of in-memory assemblies, and after a while, well, you know the result.