
Special characters in XML encoding using DOM and Java?

I have some code that converts an Excel file to XML, but when a cell's text contains special characters, I'm unable to handle them correctly. For example, a cell contains text like:

(Destinataire de flux entrants ou Origine de flux sortants) **==>** trallla

When transforming it into XML, I get:

(Destinataire de flux entrants ou Origine de flux sortants) **==&gt** trallla  

How can I get around this problem?


What you are seeing is normal XML escaping. '<', '>' and '&' are markup characters, so the serializer writes a '>' that occurs inside text as &gt; to keep it from being confused with the end of a tag. Be glad this happens automatically: unescaped markup characters in text can make the XML unparseable. Any XML parser that reads the file afterwards knows how to handle &gt; and turns it back into '>'.
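To illustrate the round trip, here is a minimal sketch using only the JDK's built-in DOM and Transformer APIs. The class name `EscapeRoundTrip`, the element name `cell`, and the helper methods `write` and `read` are made up for this example and are not taken from the asker's code.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class EscapeRoundTrip {

    // Builds <cell>text</cell> with the DOM API and serializes it to a string.
    // The element name "cell" is a placeholder for this sketch.
    static String write(String text) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element cell = doc.createElement("cell");
        cell.setTextContent(text); // pass the raw text; the serializer escapes it
        doc.appendChild(cell);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses the XML again and returns the text of the root element.
    // The parser resolves entity references such as &gt; on its own.
    static String read(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String original = "(Destinataire de flux entrants ou Origine de flux sortants) ==> trallla";
        String xml = write(original);
        System.out.println(xml);       // serialized form, with escaping applied
        System.out.println(read(xml)); // text as seen by a consumer of the XML
    }
}
```

The point of the sketch is that the escaped form only exists in the file on disk; code that reads the file through a parser gets the original characters back.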


You can also wrap the text in a CDATA section, if that helps solve your problem. Inside CDATA the characters are written as-is and are not escaped.
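A possible way to do this with the DOM API is `Document.createCDATASection`, sketched below. The class name `CdataCell` and the element name `cell` are placeholders for illustration, and the sketch assumes the JDK's default Transformer, which writes CDATA nodes out as CDATA sections.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CdataCell {

    // Builds <cell><![CDATA[text]]></cell> and serializes it to a string.
    static String write(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element cell = doc.createElement("cell");
        // A CDATA node instead of a plain text node: content is kept verbatim.
        cell.appendChild(doc.createCDATASection(text));
        doc.appendChild(cell);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write("flux sortants) ==> trallla"));
    }
}
```

CDATA changes only how the file looks; a parser returns the same text either way, so this is mainly useful when people read the raw XML.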


If you have problems reading escaped HTML characters, you can use the Apache Commons Lang library, which includes the method StringEscapeUtils.unescapeHtml(..).

The unescaped String is the input you want.
