HeaderParseError in python

Posted: 2023-03-17 06:58

I get a HeaderParseError if I try to parse this string with decode_header() in Python 2.6.5 (and 2.7). Here is the repr() of the string:

 '=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?='

This string comes from a MIME email that contains a JPEG picture. Thunderbird can decode the filename (which contains German umlauts).

>>> from email.header import decode_header
>>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
    raise HeaderParseError
email.errors.HeaderParseError

This looks like an incompatibility between the Base64 alphabet that Python expects and the one the mail agent used:

>>> from email.header import decode_header
>>> a='=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?='
>>> decode_header(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/email/header.py", line 108, in decode_header
    raise HeaderParseError
email.errors.HeaderParseError
>>> a1= a.replace('_', '/')
>>> decode_header(a1)
[('Anmeldung Netzanschluss S\xfcdring3p.jpg', 'iso-8859-1')]
>>> print _[0][0].decode(_[0][1])
Anmeldung Netzanschluss Südring3p.jpg

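Python uses the standard Base64 alphabet described in the Wikipedia article (i.e. A-Z, a-z, 0-9, +, /). That same article lists some alternative alphabets, including ones that use the underscore that is the issue here; however, the underscore's value is ambiguous (it stands for 62 or 63, depending on the alternative). In the URL- and filename-safe alphabet of RFC 4648, the underscore is 63, which is consistent with replacing it with / above.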
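To see why the underscore's value matters, here is a small sketch that decodes the Base64 payload from the question under both readings. It is written for Python 3 (an assumption, since the question uses Python 2) and uses the `altchars` parameter of `base64.b64decode`; the names `payload`, `as_63` and `as_62` are only illustrative.

```python
import base64

# Base64 payload taken from the encoded-word in the question.
payload = 'QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw=='

# altchars gives the two characters that stand for values 62 and 63.
# Reading '_' as 63 (URL-safe alphabet: '-' is 62, '_' is 63):
as_63 = base64.b64decode(payload, altchars=b'-_')

# Reading '_' as 62 (a hypothetical alternative ordering):
as_62 = base64.b64decode(payload, altchars=b'_-')

print(as_63.decode('iso-8859-1'))  # Anmeldung Netzanschluss Südring3p.jpg
print(as_62.decode('iso-8859-1'))  # Anmeldung Netzanschluss Sìdring3p.jpg
```

Only the reading with the underscore as 63 produces the umlaut, so that is most likely what the mail agent meant.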

I don't know what Python can do to guess the intentions of b0rken mail agents, so I suggest you do some appropriate guessing yourself whenever decode_header() fails.
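One possible way to do that guessing is sketched below, assuming Python 3 (where decode_header() returns bytes for encoded words) and assuming the sender used the URL-safe alphabet. The helper names `decode_header_lenient` and `_to_standard_alphabet` are made up for this example and are not part of the email package.

```python
import re
from email.errors import HeaderParseError
from email.header import decode_header

# Matches a Base64 ("B") encoded-word: =?charset?B?payload?=
_B_WORD = re.compile(r'(=\?[^?]+\?[bB]\?)([^?]*)(\?=)')
_URLSAFE_TO_STANDARD = str.maketrans('-_', '+/')


def _to_standard_alphabet(match):
    # Rewrite only the payload, leaving charset and delimiters untouched.
    payload = match.group(2).translate(_URLSAFE_TO_STANDARD)
    return match.group(1) + payload + match.group(3)


def decode_header_lenient(raw):
    """Try decode_header(); on failure retry with '-'/'_' mapped to '+'/'/'."""
    try:
        return decode_header(raw)
    except HeaderParseError:
        # Guess: the mail agent used the URL-safe Base64 alphabet.
        return decode_header(_B_WORD.sub(_to_standard_alphabet, raw))


raw = '=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?='
data, charset = decode_header_lenient(raw)[0]
print(data.decode(charset))
```

The retry happens only after the normal decode has failed, so well-formed headers take the usual path unchanged. If the second attempt also fails, the HeaderParseError still propagates to the caller.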

I'm calling the mail agent “broken” because there is no need to escape either + or / in a message header: it's not a URL, so why not use the standard alphabet?
