How to generate an exact copy of an XML document with resolved entities

2022-12-10 05:21 问答作者：

Given an XML document like this:

 <!DOCTYPE doc SYSTEM 'http://www.blabla.com/mydoc.dtd'>
 <author>john</author>
 <doc>
   <title>&title;</title>
 </doc>

I wanted to parse the above XML document and generate a copy of it with all of its entities already resolved. So given the above XMl document, the parser should output:

 <!DOCTYPE doc SYSTEM 'http://www.blabla.com/mydoc.dtd'>
 <author>john</author>
 <doc>
   <title>Stack Overflow Madness</title>
 </doc>

I know that you could implement an org.xml.sax.EntityResolver to resolve entities, but what I don't know is how to properly generate a copy of the XML document with everything still intact (except its entities). By everything, I mean the whitespaces, the dtd at the top of the document, the comments, and 开发者_JAVA技巧any other things except the entities that should have been resolved previously. If this is not possible, please suggest a way that at least can preserve most of the things (e.g. all but no comments).

Note also that I am restricted to the pure Java API provided by Sun, so no third party libraries can be used here.

Thanks very much!

EDIT: The above XML document is a much simplified version of its original document. The original one involves a very complex entity resolution using EntityResolver whose significance I have greatly reduced in this question. What I am really interested is how to produce an exact copy of the XML document with an XML parser that uses EntityResolver to resolve the entities.

You almost certainly cannot do this using any XML parser I've heard of, and certainly the Sun XML parsers cannot do it. They will happily discard details that have no significance as far as the meaning of the XML is concerned. For example,

<title>Stack Overflow Madness</title>

and

<title >Stack Overflow Madness</title >

are indistinguishable from the perspective of the XML syntax, and the Sun parsers (rightly) treat them as identical.

I think your choices are to do the replacement treating the XML as text (as @Wololo suggests) or relax your requirements.

By the way, you can probably use an XmlEntityResolver independently of the XML parser. Or create a class that does the same thing. This may mean that String.replace... is not the answer, but you should be able to implement an ad-hoc expander that iterates over the characters in a character buffer, expanding them into a second one.

Is it possible for you to read in the xml template as a string? And with the string do something like

string s = "<title>&title;</title>";
s = s.replace("&title;", "Stack Overflow Madness");
SaveXml(s);

继续阅读：xml xml-parsing

How to generate an exact copy of an XML document with resolved entities

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？