How to convert a UTF-8 string to an ASCII string? [duplicate]

This question already has answers here: Closed 12 years ago.

Possible Duplicate:

UTF-8 -> ASCII in C language

How do I convert a UTF-8 string to an ASCII string?


UTF-8 is a superset of ASCII. The character codes 0-127 (i.e. the ASCII characters) are encoded as the single bytes 0-127, exactly as in ASCII. If you want to convert UTF-8 to ASCII, you can simply remove all bytes that are >= 128. This means that non-ASCII characters will be dropped from the converted string, not replaced or transliterated, so make sure that is what you want.
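A minimal sketch of that approach in C, since the linked duplicate is about C. The function name `utf8_strip_to_ascii` and the caller-supplied output buffer are assumptions made for illustration, not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies src to dst, skipping every byte whose
 * high bit is set (value >= 128). dst must have room for at least
 * strlen(src) + 1 bytes. Returns the number of bytes written,
 * not counting the terminating '\0'. */
size_t utf8_strip_to_ascii(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; ++src) {
        /* Cast to unsigned char so the test works whether or not
         * plain char is signed on this platform. */
        if (((unsigned char)*src & 0x80) == 0) {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```

For example, passing the UTF-8 bytes of "café!" (where é is the two bytes 0xC3 0xA9) should leave "caf!" in `dst`.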

Mind that for UTF-8 decoding, you need to detect characters that are encoded as multiple bytes. For the first byte of such a character, the number of bytes in the sequence is the number of '1' bits left of the leftmost '0' bit; this only applies to bytes >= 128. For example, 11000011 is the first byte of a character that was encoded as two bytes (it has two leading '1' bits). That means you also have to remove the following byte. The following (continuation) bytes always have the form 10xxxxxx.
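The leading-bit rule can be sketched like this in C. The names `utf8_seq_len` and `utf8_next` are hypothetical, and the code only inspects bit patterns; it does not reject overlong or otherwise ill-formed sequences.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the length in bytes of the UTF-8
 * sequence that starts with `lead`, judged only by its leading bits.
 * Returns 0 for a continuation byte (10xxxxxx) or any other byte that
 * cannot start a sequence. */
int utf8_seq_len(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1; /* 0xxxxxxx: plain ASCII  */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx: two bytes    */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx: three bytes  */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx: four bytes   */
    return 0;
}

/* Hypothetical helper: returns a pointer just past the character that
 * starts at p. A stray byte that cannot start a sequence is skipped as
 * a single byte, and the scan stops early if an expected continuation
 * byte is missing (including at the end of the string). */
const char *utf8_next(const char *p)
{
    int len;

    assert(p != NULL);
    if (*p == '\0') {
        return p;
    }

    len = utf8_seq_len((unsigned char)*p);
    if (len == 0) {
        len = 1;
    }

    ++p;
    --len;
    while (len > 0 && ((unsigned char)*p & 0xC0) == 0x80) {
        ++p;
        --len;
    }
    return p;
}
```

Calling `utf8_next` repeatedly walks a string one character at a time, which is what a transliterating converter would need; plain stripping does not.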

As every byte that belongs to a multi-byte-encoded character is >= 128, removing all bytes >= 128 already removes the whole character, so for plain stripping you can just forget about the multi-byte decoding rule :)
