What encoding does Outlook use for plain text messages?
I need to decode e-mails saved from Outlook as Text Only. Unfortunately they're not in plain ISO-8859-1 since they contain special "smart quote" characters. Does the codepage used by Outlook have a real name (that I can pass to unicode.decode() in Python) or is it just some arbitrary made-up nonsense which I'll have to manually decode? And if so, does anyone have a reference for all the "special" characters Microsoft added?
It's quite likely that Outlook will save messages in the code page of your current locale. My guess would be Windows-1252, which Python knows as the codec "cp1252".
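If the guess of Windows-1252 is right, decoding could look roughly like the sketch below (Python 3). The sample bytes and the helper name cp1252_extras are made up for illustration; only the "cp1252" codec itself comes from Python's standard library. The second half lists the characters that Windows-1252 places in the 0x80-0x9F range, where ISO-8859-1 has control codes, which may serve as a quick reference for the "special" characters.

```python
import unicodedata

# Hypothetical sample: bytes as they might appear in a "Text Only" export.
# 0x93/0x94 are the curly double quotes, 0x92 the curly apostrophe.
raw = b"He said \x93hello\x94 and it\x92s fine."
text = raw.decode("cp1252")  # "windows-1252" is an alias of the same codec
print(text)


def cp1252_extras():
    """Map each byte in 0x80-0x9F to its Windows-1252 character.

    Bytes that Python's cp1252 codec leaves undefined are skipped.
    """
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            extras[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            pass  # unassigned in this codec
    return extras


for byte, char in cp1252_extras().items():
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char, '?')}")
```

If the file turns out to use a different code page, the decoded text will show wrong characters rather than raise an error in most cases, so it is worth eyeballing the result.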
Nitpick: What you call “smart quotes” is actually the way quotes are supposed to look. The quotes you've been using in your post are known as “typewriter quotes”; for mechanical typewriters, the number of keys was a major cost factor, so the quotes, which look very similar to one another, and the inch symbol were coalesced into a single key, aesthetics be damned.
There are many (locale-dependent) Windows code pages, so in the worst case the encoding depends on the country in which the sender resides.
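To make the locale dependence concrete, here is a small sketch (Python 3) that decodes the same bytes under a few Windows code pages. The function name decode_variants, the sample bytes, and the choice of candidate code pages are assumptions for illustration, not anything Outlook prescribes.

```python
def decode_variants(raw, codecs_to_try=("cp1252", "cp1251", "cp1250")):
    """Decode the same bytes under several Windows code pages.

    Returns a dict of codec name -> decoded text, or None when the
    bytes are not valid in that code page.
    """
    results = {}
    for name in codecs_to_try:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical sample: byte 0xE9 is a different letter in Western
# European (cp1252) and Cyrillic (cp1251) code pages.
variants = decode_variants(b"caf\xe9")
for name, text in variants.items():
    print(name, repr(text))
```

Since single-byte code pages accept almost any input, a successful decode does not prove the guess was right; knowing where the message came from is usually the better clue. On Windows, locale.getpreferredencoding() typically reports the code page of the current machine, which may or may not match the sender's.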