Groovy XmlSlurper vs XmlParser

2023-04-08 17:11

I searched on this topic for a while and found some results, which I list at the end of this post. Can someone help me answer these three questions precisely, for each of the cases listed below them?

1. For which use cases does XmlSlurper make more sense than XmlParser, and vice versa (from the point of view of API and syntax ease of use)?
2. Which one is more memory efficient? (It looks like the slurper.)
3. Which one processes the XML faster?

Case a. When I have to read almost all nodes in the XML?

Case b. When I have to read just a few nodes (for example, using a GPath expression)?

Case c. When I have to update/transform the XML?

Assume the XML document is not trivial (in terms of nesting depth and size).

Resources :

http://www.tutkiun.com/2009/10/xmlparser-and-xmlslurper.html states :

Difference between XMLParser and XMLSlurper:

There are similarities between XMLParser and XMLSlurper when used for simple reading, but when we use them for advanced reading and when processing XML documents in other formats, there are differences between the two.

XMLParser stores intermediate results after parsing documents. XMLSlurper, on the other hand, does not store internal results after processing XML documents.

The real, fundamental differences become apparent when processing the parsed information. That is when processing with direct in-place data manipulation and processing in a streaming scenario.

http://groovy.dzone.com/news/john-wilson-groovy-and-xml

The Groovy docs (XmlParser, XmlSlurper) and the Groovy site explain them well (here and here), but do not do a great job of answering the questions above.

The big difference between XmlSlurper and XmlParser is that the parser creates something similar to a DOM, while the slurper tries to create structures only if really needed and therefore uses paths that are lazily evaluated. To the user, both can look almost identical. The difference is that the parser structure is evaluated only once, while the slurper paths may be evaluated on demand. "On demand" can be read as "more memory efficient but slower" here. Ultimately it depends on how many paths/requests you make. If, for example, you only want to know the value of an attribute in a certain part of the XML and then be done with it, XmlParser would still process everything and execute your query on the quasi-DOM. In doing so, a lot of objects are created, and memory and CPU are spent. XmlSlurper will not create those objects, and thus saves memory and CPU. If you need all parts of the document anyway, the slurper loses its advantage, since it will create at least as many objects as the parser would.
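To illustrate how similar the two look from the user's side, here is a minimal sketch. It assumes Groovy 3 or later (where both classes live in the groovy.xml package) and uses a made-up sample document; the element and attribute names are hypothetical.

```groovy
import groovy.xml.XmlParser
import groovy.xml.XmlSlurper

// Hypothetical sample document, used only for illustration.
def xml = '''<library>
  <book id="1"><title>Don Quixote</title></book>
  <book id="2"><title>Catcher in the Rye</title></book>
</library>'''

// XmlParser hands back a tree of groovy.util.Node objects.
def parsed = new XmlParser().parseText(xml)
assert parsed instanceof groovy.util.Node
// On a Node, @id yields the attribute value as a plain String.
def parsedTitle = parsed.book.find { it.@id == '2' }.title.text()

// XmlSlurper hands back a GPathResult; the path is resolved when it is evaluated.
def slurped = new XmlSlurper().parseText(xml)
assert slurped instanceof groovy.xml.slurpersupport.GPathResult
// On a GPathResult, @id is itself a GPathResult, so text() is called explicitly.
def slurpedTitle = slurped.book.find { it.@id.text() == '2' }.title.text()

println parsedTitle
println slurpedTitle
```

The navigation expressions are nearly the same; what differs is the type of object you get back and when the path is actually evaluated.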

Both can transform the document, but the slurper assumes the document is constant, so you would first have to write the changes out and create a new slurper to read the new XML in. The parser lets you see the changes right away.
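A small sketch of that difference, again assuming Groovy 3 or later and a hypothetical sample document. It replaces an element's content with each API and then queries the same object; for the slurper, the change only shows up after serializing (here with StreamingMarkupBuilder) and parsing again.

```groovy
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlParser
import groovy.xml.XmlSlurper

// Hypothetical sample document, used only for illustration.
def xml = '<people><person name="Fred"><city>Oslo</city></person></people>'

// XmlParser: the Node tree is mutable, so the next query sees the edit.
def parsed = new XmlParser().parseText(xml)
parsed.person[0].city[0].value = 'Bergen'
def parserCity = parsed.person[0].city.text()

// XmlSlurper: replaceNode only records the change for later serialization.
def slurped = new XmlSlurper().parseText(xml)
slurped.person[0].city[0].replaceNode { city('Bergen') }
def slurperCityBefore = slurped.person[0].city.text()   // still the old content

// Write the document out and slurp it again to query the new content.
def rewritten = new StreamingMarkupBuilder().bind { mkp.yield slurped }.toString()
def slurperCityAfter = new XmlSlurper().parseText(rewritten).person[0].city.text()

println "parser: $parserCity"
println "slurper before re-parse: $slurperCityBefore"
println "slurper after re-parse: $slurperCityAfter"
```

If you make many small edits and need to query between them, this extra round trip is the cost to weigh against the slurper's lighter footprint.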

So the answer to question (1), the use case, is that you use the parser if you have to process the whole XML, and the slurper if you need only parts of it. API and syntax don't really play much of a role in that; the Groovy people try to make the two very similar in user experience. You would also prefer the parser over the slurper if you want to make incremental changes to the XML.

The intro above also explains which is more memory efficient, question (2). The slurper is, unless you read everything in anyway, in which case the parser may be; I don't have actual numbers on how big the difference is then.

Question (3) can also be answered by the intro. If you have multiple lazily evaluated paths that you have to evaluate again, this can be slower than just navigating an existing graph as in the parser. So the parser can be faster, depending on your usage.

So I would say that for (3a), reading almost all nodes, the choice itself does not make much of a difference, since the requests are then the more determining factor. For (3b), I would say the slurper is faster if you just have to read a few nodes, since it does not have to create a complete structure in memory, which in itself already costs time and memory.
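For case (b), a query that picks a few nodes out of a deeper document might look like the sketch below. It assumes Groovy 3 or later; the catalog structure and the lang attribute are hypothetical. It shows the query style only and is not a benchmark.

```groovy
import groovy.xml.XmlSlurper

// Hypothetical nested document, used only for illustration.
def xml = '''<catalog>
  <section name="fiction">
    <shelf><book id="1" lang="es"><title>Don Quixote</title></book></shelf>
    <shelf><book id="2" lang="en"><title>Catcher in the Rye</title></book></shelf>
  </section>
  <section name="science">
    <shelf><book id="3" lang="en"><title>Cosmos</title></book></shelf>
  </section>
</catalog>'''

def catalog = new XmlSlurper().parseText(xml)

// '**' is GPath shorthand for a depth-first traversal of all descendants.
def englishBooks = catalog.'**'.findAll { it.name() == 'book' && it.@lang.text() == 'en' }

// Only the matching nodes are touched to read their titles.
def englishTitles = englishBooks.collect { it.title.text() }

println englishTitles
```

Whether this is actually faster than the XmlParser equivalent depends on document size and how many such queries you run, so measure with your own data before deciding.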

As for (3c): these days both can update/transform the XML. Which is faster depends more on how many parts of the XML you have to change. If many parts, I would say the parser; if not, then maybe the slurper. But if you want to, for example, change an attribute value from "Fred" to "John" with the slurper and then later query for this "John" using the same slurper, it won't work.

I will give you crisp answers:

XmlParser is faster than XmlSlurper.
XmlSlurper consumes less memory than XmlParser.
XmlParser can parse and update the XML simultaneously.
With XmlSlurper, you need to rebuild the XML with a markup builder after each update you make.
When you want to use path expressions, XmlSlurper is the better choice.
For reading almost all nodes, XmlParser is fine.

Hope it helps
