Wide char to bytes using a bit pattern?

If the number of bytes in a UTF-8 encoded wide char is known, would it be possible to get the bytes using the following method?

For example:

Converting the wide character ¿ (code 191) to bytes gives -62 and -65.

I've tried to fit the 8 bits of 191 into the slots, but didn't get the same result:

110[0][0][0][1][0]   10[1][1][1][1][1][1]

      127                   255


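First, don't convert to signed bytes. That just confuses matters: -62 and -65 are simply 194 and 191 read as signed 8-bit values (194 - 256 = -62, 191 - 256 = -65). So code point 191 yields the byte sequence 194 191: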

Decimal: 194                   191
Binary:  110[0][0][0][1][0]    10[1][1][1][1][1][1]
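
The signed values from the question and the unsigned values above are the same bytes, just read differently. A minimal Python sketch of that reinterpretation (the helper name `to_unsigned` is made up for this illustration):

```python
def to_unsigned(b: int) -> int:
    """Reinterpret a signed 8-bit value (-128..127) as unsigned (0..255)."""
    return b & 0xFF


signed = [-62, -65]
unsigned = [to_unsigned(b) for b in signed]

print(unsigned)                               # [194, 191]
print([format(b, "08b") for b in unsigned])   # ['11000010', '10111111']
```

Masking with `0xFF` keeps only the low eight bits, which is why the bit patterns match the `110xxxxx 10xxxxxx` layout once the sign is out of the way.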

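To generate these bytes, you start from the right edge of the code point. Its low six bits fill the six slots of the second byte (191), and the next two bits fill the last two slots of the first byte (194); the three remaining slots in the first byte are left over and stay zero. Laid out as a 16-bit value, the code point's bits are: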

Binary:  00000[0][0][0]    [1][0][1][1][1][1][1][1]
Decimal: 0                 191
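
The slot-filling described above can be sketched in Python as follows. This is only an illustration for the two-byte case (code points 128 through 2047); the function names `encode_two_byte` and `decode_two_byte` are invented here, and the result is checked against Python's built-in encoder.

```python
def encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte UTF-8 covers only U+0080..U+07FF")
    low_six = code_point & 0b111111      # rightmost six bits
    high_five = code_point >> 6          # remaining bits (at most five)
    first = 0b11000000 | high_five       # fills the slots of 110xxxxx
    second = 0b10000000 | low_six        # fills the slots of 10xxxxxx
    return bytes([first, second])


def decode_two_byte(data: bytes) -> int:
    """Recover the code point from a two-byte UTF-8 sequence."""
    first, second = data
    return ((first & 0b00011111) << 6) | (second & 0b00111111)


encoded = encode_two_byte(191)
print(list(encoded))                     # [194, 191]
print(decode_two_byte(encoded))          # 191

# Cross-check against the built-in UTF-8 encoder.
assert encoded == chr(191).encode("utf-8")
```

For 191 (binary 10111111), the low six bits are 111111 and the remaining bits are 10, so the first byte becomes 11000010 (194) and the second 10111111 (191).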

Wikipedia has a surprisingly good writeup on how this all works.
