Do 7bit and 8bit encoded messages have to be decoded before outputting?

Is there any relation between the 7bit transfer encoding and UTF-7, or between 8bit and UTF-8?

Does it make sense to manually convert the message body to the expected encoding (assume 'utf-8'), as in the code below?

    function decodeBody($body, $transferEncoding, $bodyEncoding) {

        switch ($transferEncoding) {
            case '7BIT' :
            case '8BIT' :
                // any additional decoding here ?
                $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
                break;

            case 'BASE64' :
                $body = base64_decode($body);
                $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
                break;

            case 'QUOTED_PRINTABLE' :
                $body = quoted_printable_decode($body);
                $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
                break;
        }

        return $body;
    }


Citing RFC 1341:

The values "8bit", "7bit", and "binary" all imply that NO encoding has been performed... "8bit" means that the lines are short, but there may be non-ASCII characters (octets with the high-order bit set).

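This means that 7bit data contains no octets with the high-order bit set. If the declared charset is US-ASCII, the body is already valid UTF-8 and there is no need to call mb_convert_encoding() in that case. (A 7bit body can still carry a 7-bit charset such as ISO-2022-JP or UTF-7, which would need converting; the 7bit transfer encoding itself has nothing to do with UTF-7.) '8bit' means that non-ASCII octets might be present, but as far as I understand they need not be in the UTF-8 charset encoding; it might be iso-8859-1 or anything else as well. So AFAIK '8bit' doesn't mean UTF-8 automatically, and any conversion has to be driven by the charset declared for the body, not by the transfer encoding.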
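To illustrate the point, here is a minimal sketch that treats the two steps separately: first undo the transfer encoding (a no-op for 7bit and 8bit), then convert from the declared charset. The function name decodeBodyToUtf8() is hypothetical, and the sketch assumes the mbstring extension is available and that the charset name is one mb_convert_encoding() recognises. It accepts both 'QUOTED-PRINTABLE' (the header spelling) and 'QUOTED_PRINTABLE' (the spelling used in the question).

```php
<?php
// Hypothetical helper, a sketch only: not part of PHP or any library.
function decodeBodyToUtf8(string $body, string $transferEncoding, string $charset): string
{
    // Step 1: undo the Content-Transfer-Encoding.
    // 7bit, 8bit and binary mean no encoding was applied, so they fall through untouched.
    switch (strtoupper(trim($transferEncoding))) {
        case 'BASE64':
            $body = base64_decode($body);
            break;
        case 'QUOTED-PRINTABLE':
        case 'QUOTED_PRINTABLE':
            $body = quoted_printable_decode($body);
            break;
    }

    // Step 2: convert from the declared charset to UTF-8.
    // US-ASCII is a subset of UTF-8, so those bytes can be returned as they are.
    $charset = strtoupper(trim($charset));
    if ($charset === '' || $charset === 'US-ASCII' || $charset === 'UTF-8') {
        return $body;
    }

    return mb_convert_encoding($body, 'UTF-8', $charset);
}
```

For example, an 8bit body in iso-8859-1 only needs the charset step:

```php
echo bin2hex(decodeBodyToUtf8("caf\xE9", '8bit', 'ISO-8859-1')); // 636166c3a9 ("café" in UTF-8)
```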
