Encoding with ZipFile Python
I'm trying to generate an .xml file inside a zip archive using the ZipFile library in Python 3.9. One of the attribute values contains a non-ASCII character:
"Invoice n°123456789"
import xml.etree.ElementTree as ET
from datetime import datetime
from zipfile import ZipFile

def gen_zip():
    # Init xml element
    root = ET.Element("MyXml")
    # Fill SubElement
    batch = ET.SubElement(
        root,
        "Batch",
        Id="1",
        DateTime=datetime.strftime(datetime.now(), "%Y%m%d%H%M%S"),
        Value="Invoice n°123456789"
    )
    zf = ZipFile("myzip.zip", "w")
    # Write xml to zip
    zf.writestr("file.xml", ET.tostring(root))
    zf.close()
    return None
The problem I have is that the content of my file file.xml in myzip.zip is the following:
<MyXml>
<Batch Id="1" DateTime="20221207093739" Value="Invoice n&#176;123456789"/>
</MyXml>
I don't know why the ° symbol becomes &#176;. As per the docs, the writestr function encodes str data as UTF-8 (https://docs.python.org/3/library/zipfile.html).
I'd like the content of my .xml to be Invoice n°123456789 instead of Invoice n&#176;123456789.
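Add the correct encoding as a parameter when you call tostring. By default it serializes to us-ascii bytes and writes every non-ASCII character as a numeric character reference such as &#176;. With encoding='unicode' it returns a str instead, which writestr then encodes as UTF-8: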
zf.writestr("file.xml", ET.tostring(root, encoding='unicode'))
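To see the difference without touching the disk, here is a minimal sketch that serializes the same element both ways and reads the archived file back. It assumes an in-memory zip (io.BytesIO) and a cut-down Batch element; the variable names are illustrative only and not part of the original code.

```python
import io
import xml.etree.ElementTree as ET
from zipfile import ZipFile

# Build a small tree with a non-ASCII attribute value
root = ET.Element("MyXml")
ET.SubElement(root, "Batch", Id="1", Value="Invoice n°123456789")

# Default: us-ascii bytes, so ° is written as a character reference
ascii_bytes = ET.tostring(root)

# encoding="unicode": returns a str that still contains the ° character
unicode_text = ET.tostring(root, encoding="unicode")

# Write the str to an in-memory zip; writestr encodes str data as UTF-8
buffer = io.BytesIO()
with ZipFile(buffer, "w") as zf:
    zf.writestr("file.xml", unicode_text)

# Read the archived file back and decode it as UTF-8
with ZipFile(buffer) as zf:
    stored = zf.read("file.xml").decode("utf-8")

print(ascii_bytes)
print(stored)
```

Both outputs are well-formed XML that a parser reads as the same value; the only difference is whether the character is stored literally or as a numeric reference.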
The escape function from the xml.sax.saxutils module only escapes &, < and >, and ElementTree already does that itself when it serializes, so escaping the value beforehand has no effect on the ° character. The &#176; reference comes from the default us-ascii encoding used by tostring. To keep the character as-is, pass an explicit encoding so that tostring returns UTF-8 bytes.
Here's how you could modify your code:
import xml.etree.ElementTree as ET
from datetime import datetime
from zipfile import ZipFile

def gen_zip():
    # Init xml element
    root = ET.Element("MyXml")
    # Fill SubElement; ElementTree escapes &, < and > on serialization
    batch = ET.SubElement(
        root,
        "Batch",
        Id="1",
        DateTime=datetime.strftime(datetime.now(), "%Y%m%d%H%M%S"),
        Value="Invoice n°123456789"
    )
    # Serialize to UTF-8 bytes so that ° is written literally
    xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    # Write xml to zip; the context manager closes the archive
    with ZipFile("myzip.zip", "w") as zf:
        zf.writestr("file.xml", xml_bytes)
This should produce an XML file with the Invoice n°123456789 value, without the ° character being replaced by &#176;.