Do 7bit and 8bit encoded messages have to be decoded before outputting?
What, if any, is the relation between the 7bit transfer encoding and UTF-7, and between 8bit and UTF-8?
Does it make sense to manually convert the message body to the expected encoding (assume 'utf-8'), as in the code below?
function decodeBody($body, $transferEncoding, $bodyEncoding) {
    switch ($transferEncoding) {
        case '7BIT':
        case '8BIT':
            // any additional decoding here?
            $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
            break;
        case 'BASE64':
            $body = base64_decode($body);
            $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
            break;
        case 'QUOTED_PRINTABLE':
            $body = quoted_printable_decode($body);
            $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
            break;
    }
    return $body;
}
Citing RFC 1341:
The values "8bit", "7bit", and "binary" all imply that NO encoding has been performed... "8bit" means that the lines are short, but there may be non-ASCII characters (octets with the high-order bit set).
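This means that 7bit data contains only octets below 128. In the common case that is plain US-ASCII, which is already valid UTF-8, so there is no need to call mb_convert_encoding() (the exception is a 7-bit-safe charset such as ISO-2022-JP or UTF-7 declared in the Content-Type header, which still has to be converted).
'8bit' means that non-ASCII characters might be present, but as far as I understand they need not be UTF-8: the body might just as well be ISO-8859-1 or anything else. So AFAIK '8bit' doesn't automatically mean UTF-8, and likewise '7bit' has nothing to do with UTF-7. The charset comes from the Content-Type header, not from the transfer encoding.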
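To make the distinction concrete, here is a minimal sketch that treats the two steps separately: first undo the transfer encoding, then convert from the declared charset. The function name decodeBodyToUtf8 is hypothetical, the transfer-encoding labels are assumed to be the raw header values ('7bit', 'base64', 'quoted-printable', ...) rather than the upper-case labels used in the question, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper, not part of any library: a sketch under the
// assumptions stated above.
function decodeBodyToUtf8(string $body, string $transferEncoding, string $charset): string
{
    // Step 1: undo the Content-Transfer-Encoding.
    // '7bit', '8bit' and 'binary' mean that no encoding was applied,
    // so the body is left untouched for those values.
    switch (strtolower(trim($transferEncoding))) {
        case 'base64':
            $decoded = base64_decode($body);
            if ($decoded !== false) {
                $body = $decoded;
            }
            break;
        case 'quoted-printable':
            $body = quoted_printable_decode($body);
            break;
    }

    // Step 2: charset conversion, driven by the charset parameter of the
    // Content-Type header, independently of the transfer encoding.
    // US-ASCII is a subset of UTF-8, so both cases need no conversion.
    if (in_array(strtolower(trim($charset)), ['utf-8', 'us-ascii'], true)) {
        return $body;
    }

    // Assumption: $charset is a name that mbstring recognises.
    return mb_convert_encoding($body, 'UTF-8', $charset);
}
```

For instance, calling it with "caf=E9", 'quoted-printable' and 'ISO-8859-1' should return the UTF-8 bytes for "café", while an '8bit' body declared as 'utf-8' comes back unchanged.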