cross-encoding XSL transformations
I have some operations to do on an XML file (nothing important), and XSL fits this case very well. However, my input file is encoded in UTF-8 and the file after the transformation MUST be encoded in iso-8859-1. (I do not control the encoding of the input file either.)
Everything goes well, except that some special characters that exist in UTF-8 but not in iso-8859-1 are escaped in the output file.
For instance I have
<text>some text with a € character</text>
transformed into
<text>some text with a &#8364; character</text>
The "&#8364;" in the output file is an issue for me.
Since something has to be done with the special characters that do not exist in iso-8859-1, I first thought of replacing them manually with the replace function: replace(., '€', 'euros'). But there are so many characters in UTF-8 that are not in iso-8859-1 that this quickly becomes tedious... and slow!
Do you have a better solution? (Assume we can simply remove those characters or transform them into any viable iso-8859-1 character.)
Thanks in advance
Do you have
<xsl:output encoding="iso-8859-1" />
in place?
Because that should be all you need, really. If your XSL processor does not correctly translate characters to the target encoding on its own, it is broken and you need to use a different one.
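To illustrate what happens to characters the target encoding cannot represent, here is a small sketch in Python (not XSLT, and not taken from any XSL processor; the name serialize_text is made up for this illustration). Python's xmlcharrefreplace error handler mimics what an XML serializer typically does: characters that exist in ISO-8859-1 are written as single bytes, and anything else is written as a numeric character reference.

```python
def serialize_text(text: str, encoding: str = "iso-8859-1") -> bytes:
    """Encode text, writing unencodable characters as numeric character references.

    Hypothetical helper for illustration only; it imitates the usual
    serializer behaviour, it is not an XSLT API.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


# "é" exists in ISO-8859-1, so it becomes the single byte 0xE9.
latin1_ok = serialize_text("café")

# "€" (U+20AC, decimal 8364) does not exist in ISO-8859-1,
# so it is written as the character reference &#8364;.
euro_escaped = serialize_text("a € character")

print(latin1_ok)
print(euro_escaped)
```

Seen this way, a &#8364; in the output is still a Euro sign to any XML parser that reads the file; it only shows that the target encoding has no byte for that character.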
Hints
- Often Windows-1252 is what people really mean when they say ISO-8859-1. Check closely whether that applies to you as well. There are subtle differences between the two (especially with regard to the Euro sign, which does not exist in ISO-8859-1, but does exist in Windows-1252 and ISO-8859-15).
- Whenever an XML declaration
<?xml version="1.0" encoding="iso-8859-1"?>
is missing in an XML file, UTF-8 encoding is assumed. Be sure to put a declaration at the top of your file whenever it is not UTF-8 encoded.
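Both hints can be checked quickly outside of XSLT. The following is a minimal sketch in Python, assuming the standard library codecs and xml.etree.ElementTree; the helper names can_encode and parses are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

EURO = "\u20ac"  # the Euro sign


def can_encode(char: str, encoding: str) -> bool:
    """Return True if the character exists in the given encoding."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def parses(data: bytes) -> bool:
    """Return True if the bytes are accepted as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Hint 1: the Euro sign is missing from ISO-8859-1 only.
for name in ("iso-8859-1", "iso-8859-15", "windows-1252"):
    print(name, can_encode(EURO, name))

# Hint 2: ISO-8859-1 bytes without a declaration are read as UTF-8 and rejected.
body = "<text>caf\u00e9</text>".encode("iso-8859-1")
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + body

print("without declaration:", parses(body))
print("with declaration:", parses(declared))
```

If the consumer of the output file actually expects Windows-1252 or ISO-8859-15, the first loop shows that the Euro sign can be written directly once the matching encoding name is used.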