XML parsing and usage

2022-12-12 23:03

I'm building a conforming and validating XML parser in C++ and trying to make it light-weight for use on Pocket PC.

At the beginning I decided to add some "events" to my parser like SAX does, informing about elements, processing instructions, etc.

These events are consumed by a derived class that builds the DOM tree of the XML document.
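To make the setup concrete, here is a minimal sketch of that arrangement. The names `XmlEventHandler`, `Node` and `DomBuilder` are hypothetical (they are not taken from the actual parser or from any library), and attributes, comments and the DOCTYPE are left out to keep it short.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical SAX-like event interface; the parser would call these.
struct XmlEventHandler {
    virtual ~XmlEventHandler() = default;
    virtual void onStartElement(const std::string& name) = 0;
    virtual void onEndElement(const std::string& name) = 0;
    virtual void onText(const std::string& text) = 0;
    virtual void onProcessingInstruction(const std::string& target,
                                         const std::string& data) = 0;
};

// Deliberately tiny tree node (assumption: no attributes or comments).
struct Node {
    enum class Kind { Element, Text, ProcessingInstruction };
    Kind kind = Kind::Element;
    std::string name;   // element name or PI target
    std::string value;  // text content or PI data
    std::vector<std::unique_ptr<Node>> children;
};

// Derived class that turns the event stream into a tree.
class DomBuilder : public XmlEventHandler {
public:
    DomBuilder() {
        root_.name = "#document";
        open_.push_back(&root_);
    }
    // open_ stores a pointer to root_, so copying or moving would dangle.
    DomBuilder(const DomBuilder&) = delete;
    DomBuilder& operator=(const DomBuilder&) = delete;

    void onStartElement(const std::string& name) override {
        open_.push_back(append(Node::Kind::Element, name, ""));
    }
    void onEndElement(const std::string& name) override {
        // A well-formedness check the parser would normally have done already.
        assert(open_.size() > 1 && open_.back()->name == name);
        open_.pop_back();
    }
    void onText(const std::string& text) override {
        append(Node::Kind::Text, "", text);
    }
    void onProcessingInstruction(const std::string& target,
                                 const std::string& data) override {
        append(Node::Kind::ProcessingInstruction, target, data);
    }
    const Node& document() const { return root_; }

private:
    // Adds a child to the innermost open element and returns it.
    Node* append(Node::Kind kind, const std::string& name,
                 const std::string& value) {
        auto node = std::make_unique<Node>();
        node->kind = kind;
        node->name = name;
        node->value = value;
        Node* raw = node.get();
        open_.back()->children.push_back(std::move(node));
        return raw;
    }

    Node root_;
    std::vector<Node*> open_;  // stack of currently open elements
};
```

An application that does not want a tree would simply derive its own handler from the same interface instead of using `DomBuilder`.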

My doubts arise mainly when trying to handle entities (which, if so defined, can contain elements, PIs and comments) and their resolution.

For example, I could create an XMLEntityRef class that refers to an XMLEntity defined in an XMLDocType object, as the .NET System.Xml parser does.

As far as I know, for most purposes an application needs to know an element, its contents, its attributes and their values: only strings. It doesn't care whether the element content is made up of CDATA sections, entity references and/or plain text. The same applies to attribute values.

So my question is the following: what is the benefit of passing each XML object to the application as it appears and letting it (or a helper class) build, for example, the resulting attribute value by concatenating text and resolved entity references?
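For illustration, the helper-class option might look roughly like the sketch below. `ValuePiece`, `EntityTable` and `flattenValue` are made-up names, and the sketch assumes each entity's replacement text is a plain string with no nested references or markup, which a real conforming parser could not assume.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One fragment of an attribute value as the parser saw it (hypothetical type).
struct ValuePiece {
    enum class Kind { Text, EntityRef };
    Kind kind;
    std::string data;  // literal text, or the entity name for EntityRef
};

// Entity name -> replacement text (assumption: already fully expanded).
using EntityTable = std::map<std::string, std::string>;

// Concatenates text and resolved entity references into one string.
// Returns false, leaving 'out' partially filled, on an undeclared entity.
inline bool flattenValue(const std::vector<ValuePiece>& pieces,
                         const EntityTable& entities,
                         std::string& out) {
    out.clear();
    for (const ValuePiece& piece : pieces) {
        if (piece.kind == ValuePiece::Kind::Text) {
            out += piece.data;
            continue;
        }
        auto found = entities.find(piece.data);
        if (found == entities.end()) {
            return false;  // a validating parser would report an error here
        }
        out += found->second;
    }
    return true;
}

// Usage sketch: the value "Tom &amp; &who;" with <!ENTITY who "Jerry">.
inline std::string demoFlatten() {
    EntityTable entities{{"amp", "&"}, {"who", "Jerry"}};
    std::vector<ValuePiece> pieces{
        {ValuePiece::Kind::Text, "Tom "},
        {ValuePiece::Kind::EntityRef, "amp"},
        {ValuePiece::Kind::Text, " "},
        {ValuePiece::Kind::EntityRef, "who"}};
    std::string value;
    bool ok = flattenValue(pieces, entities, value);
    assert(ok);
    (void)ok;
    return value;  // "Tom & Jerry"
}
```

The application only ever sees the final string, while the list of pieces stays available for a caller that wants to know where each part came from.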

If I'm taking a poll, please answer: does your application need to know about cdata tags and where they are located in the xml file, or do you prefer to keep things easy and just get the full content of an element as a string, without worrying about how it is built?

Best regards, Mauro H. Leggieri

I'm building a conforming and validating XML parser in C++ and trying to make it light-weight

There is no such thing as a light-weight conforming (never mind validating) parser. To be a conforming parser you have to understand all the stuff that can go in a DTD external subset, which is gnarly work indeed. It is a shame that the XML specification ended up weighed down with all the SGML DTD crud, but we are stuck with it now.

does your application need to know about cdata tags and where they are located in the xml file

Normally no. DOM Level 3 LS does require that CDATA sections be kept as CDATASection nodes in the DOM by default, but almost no application cares.

(If the question is about my application then yes, because my application is a templating system that keeps CDATA sections where they were. But still.)

My doubts arise mainly when trying to handle entities

God yes. Entity references are a total disaster. Making a DOM implementation support them in a way that is compliant with DOM Level 3 Core/LS is very, very complicated. Avoid if at all possible.

Generally, XML is not lightweight. You are better off with JSON.

When building a parser, I do not think you should presume anything about how applications will consume the XML. Rather, provide the most granular level of data for each XML node, for maximum flexibility. While this may require more work on the part of consuming applications, they will be able to accomplish whatever they need to. Good luck.
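A granular core does not have to push that extra work onto every application. As a rough sketch (all names here are hypothetical, and entity replacement is assumed to be plain text), a thin adapter can sit on top of the fine-grained events and merge text, CDATA and entity references into single strings for callers that only want those:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Fine-grained events, as the parser core would emit them.
struct GranularHandler {
    virtual ~GranularHandler() = default;
    virtual void onStartElement(const std::string& name) = 0;
    virtual void onEndElement(const std::string& name) = 0;
    virtual void onText(const std::string& text) = 0;
    virtual void onCData(const std::string& text) = 0;
    virtual void onEntityRef(const std::string& name,
                             const std::string& replacement) = 0;
};

// Coarse events for applications that only care about strings.
struct SimpleHandler {
    virtual ~SimpleHandler() = default;
    virtual void onStartElement(const std::string& name) = 0;
    virtual void onEndElement(const std::string& name) = 0;
    virtual void onCharacters(const std::string& text) = 0;
};

// Buffers adjacent character data and forwards it as one onCharacters call.
class CoalescingAdapter : public GranularHandler {
public:
    explicit CoalescingAdapter(SimpleHandler& target) : target_(target) {}

    void onStartElement(const std::string& name) override {
        flush();
        target_.onStartElement(name);
    }
    void onEndElement(const std::string& name) override {
        flush();
        target_.onEndElement(name);
    }
    void onText(const std::string& text) override { pending_ += text; }
    void onCData(const std::string& text) override { pending_ += text; }
    void onEntityRef(const std::string& /*name*/,
                     const std::string& replacement) override {
        pending_ += replacement;
    }

private:
    // Emits buffered character data, if any, before a markup boundary.
    void flush() {
        if (!pending_.empty()) {
            target_.onCharacters(pending_);
            pending_.clear();
        }
    }
    SimpleHandler& target_;
    std::string pending_;
};

// Records what the simple side receives, for demonstration only.
struct Recorder : SimpleHandler {
    std::vector<std::string> log;
    void onStartElement(const std::string& name) override {
        log.push_back("<" + name + ">");
    }
    void onEndElement(const std::string& name) override {
        log.push_back("</" + name + ">");
    }
    void onCharacters(const std::string& text) override { log.push_back(text); }
};

// Events for: <p>a <![CDATA[<b>]]> &amp; c</p>
inline std::vector<std::string> demoCoalesce() {
    Recorder rec;
    CoalescingAdapter adapter(rec);
    adapter.onStartElement("p");
    adapter.onText("a ");
    adapter.onCData("<b>");
    adapter.onText(" ");
    adapter.onEntityRef("amp", "&");
    adapter.onText(" c");
    adapter.onEndElement("p");
    assert(rec.log.size() == 3);
    return rec.log;  // "<p>", "a <b> & c", "</p>"
}
```

Applications that do care about CDATA boundaries or entity references (a templating system, say) would implement `GranularHandler` directly and skip the adapter.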
