开发者

Encoding with ZipFile Python

I'm trying to generate an .xml file using the library ZipFile in python 3.9.

The content i'm trying to add is the following : "Invoice n°123456789"

import xml.etree.El开发者_运维百科ementTree as ET
from zipfile import ZipFile

def gen_zip():

  # Init xml element
  root = ET.Element("MyXml")
  # Fill SubElement
  batch = ET.SubElement(
      root,
      "Batch",
      Id="1",
      DateTime=datetime.strftime(datetime.now(), "%Y%m%d%H%M%S"),
      Value = "Invoice n°123456789"
  )

  zf = ZipFile("myzip.zip", "w")
  # Write xml to zip
  zf.writestr("file.xml", ET.tostring(root))
  zf.close()

  return None

The problem i have is that the content of my file file.xml in myzip.zip is the following :

<MyXml>
  <Batch Id="1" DateTime="20221207093739" Value="Invoice n&#176;123456789"/>
</MyXml>

I don't know why the ° symbol becomes &#176;. As per the doc the writestr function does utf-8 encoding and not unicode. (https://docs.python.org/3/library/zipfile.html)

I'd like the content of my .xml to be Invoice n°123456789 i/o Invoice n&#176;123456789


Add the correct encoding as a parameter when you call tostring.

zf.writestr("file.xml", ET.tostring(root, encoding='unicode'))


To include special characters in an XML file using the ElementTree library without them being escaped, you can use the escape method from the xml.sax.saxutils module, like this:

Here's how you could modify your code to use a CDATA section:

import xml.etree.ElementTree as ET
import xml.sax.saxutils
from zipfile import ZipFile

def gen_zip():
  # Init xml element
  root = ET.Element("MyXml")
  # Escape the special characters in the invoice value
  value = xml.sax.saxutils.escape("Invoice n°123456789")
  # Fill SubElement with the escaped value
  batch = ET.SubElement(
      root,
      "Batch",
      Id="1",
      DateTime=datetime.strftime(datetime.now(), "%Y%m%d%H%M%S"),
      Value=value
  )

  zf = ZipFile("myzip.zip", "w")
  # Write xml to zip
  zf.writestr("file.xml", ET.tostring(root))
  zf.close()

  return None


This should produce an XML file with the Invoice n°123456789 value, without the special characters being escaped.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜