
Parsing different language characters and accents into valid XML

I have a bunch of XML data in different languages, and the text contains accented characters. For example:

<text content="vídeo..." /> or <text content="vidéo..." />

This data comes from MySQL. I then assemble it with SimpleXML, which refuses to insert the data at all when these characters are in the content.

I tried (as someone suggested) calling utf8_encode() on the data beforehand, just to see if that helped.

Am I missing something obvious?


Welcome to character encoding. First, you have to make sure the encoding you use matches wherever your XML is consumed. The encoding of the data you add has to be the same as the encoding of your XML file. If the XML is only for your own environment, you can use whichever encoding works best for you, but if it needs to work around the globe, UTF-8 is your best bet.
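As a minimal sketch of keeping the data and the document in the same encoding: the snippet below assumes the MySQL data arrives either as valid UTF-8 or as ISO-8859-1 (that second assumption is mine, not something stated in the question), and that the mbstring extension is available. toUtf8() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return $value as UTF-8.
// ASSUMPTION: anything that is not already valid UTF-8 is ISO-8859-1.
function toUtf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

// "vídeo" as ISO-8859-1 bytes, as it might arrive over a latin1 connection.
$latin1 = "v\xEDdeo";

// Declare UTF-8 in the document so the file and the data agree.
$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><root/>');
$text = $xml->addChild('text');
$text->addAttribute('content', toUtf8($latin1));

echo $xml->asXML();
```

SimpleXML expects the strings you pass in to be UTF-8, so converting at the boundary is usually enough. It may be cleaner still to have MySQL hand over UTF-8 directly by setting the connection character set (for example with mysqli's set_charset('utf8mb4')), in which case the helper becomes a no-op.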

If you have characters that cannot be represented in your encoding, you have to encode them as character references. If you do that with named entity references, which is what htmlentities() produces, you will have to add a DTD declaring those entities to your XML file, because XML only knows a handful of predefined ones (&amp;lt;, &amp;gt;, &amp;amp;, &amp;quot; and &amp;apos;). If you need some DTDs you can download them here. If you cannot use a DTD, you have to use numeric character references in your XML file.
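A rough sketch of the numeric-reference route, for XML that you write out as text yourself. It assumes UTF-8 input and the mbstring extension; toNumericRefs() is a hypothetical helper name, and the decode-then-escape step is just one way to get rid of named entities that htmlentities() has already produced.

```php
<?php
// Hypothetical helper: replace every non-ASCII character in a UTF-8 string
// with a decimal numeric character reference such as &#237;.
function toNumericRefs(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// A named entity as htmlentities() would produce it; plain XML does not define &eacute;.
$named = 'vid&eacute;o';

// 1. Turn named entities back into real characters.
$decoded = html_entity_decode($named, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// 2. Re-escape the characters XML itself reserves (&, <, >, quotes).
$escaped = htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');

// 3. Replace everything outside ASCII with numeric references.
$numeric = toNumericRefs($escaped);

echo $numeric, "\n"; // vid&#233;o
```

The result is plain ASCII, so it survives whatever encoding the XML file ends up in, and no DTD is needed because numeric references are part of XML itself.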
