Wide char to bytes using a bit pattern?
If the number of bytes in a UTF-8 encoded wide char is known, would it be possible to get the bytes using the following method?
For example:
Wide character ¿
code 191
to bytes gives -62
and -65
I've tried to fit the 8 bits in 191
into the slots but didn't get the same result
110[0][0][0][1][0] 10[1][1][1][1][1][1]
127 255
First, don't convert to signed bytes. That just confuses matters. Read as unsigned values, -62 and -65 are 194 and 191, so code point 191 yields the byte sequence 194 191:
Decimal: 194 191
Binary: 110[0][0][0][1][0] 10[1][1][1][1][1][1]
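The signed values in the question and the unsigned values here are the same bytes, just read differently. A minimal Python sketch of the conversion, assuming the -62 and -65 came from a signed 8-bit byte type (the variable names are only for illustration):

```python
# Signed byte values as reported in the question.
signed = [-62, -65]

# Masking with 0xFF reinterprets each value as an unsigned byte (0..255).
unsigned = [b & 0xFF for b in signed]

print(unsigned)                              # [194, 191]
print([format(b, "08b") for b in unsigned])  # ['11000010', '10111111']
```

The binary strings show the fixed 110 and 10 prefixes, with the remaining positions being the slots that hold the code point's bits.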
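To fill the slots, start from the right edge of the code point. The low six bits go into the second byte (the one that comes out as 191), the next two bits go into the first byte (the one that comes out as 194), and the three slots left over in the first byte are filled with zeros. Written out as a 16-bit code point, the bits that land in the slots are: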
Binary: 00000[0][0][0] [1][0][1][1][1][1][1][1]
Decimal: 0 191
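The slot filling described above can be sketched in Python as follows. This is only an illustration for the two-byte case (code points U+0080 to U+07FF), and `encode_two_byte` is a hypothetical helper name, not a library function:

```python
def encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte UTF-8 covers U+0080..U+07FF only")
    # Low six bits fill the slots of the continuation byte (prefix 10).
    second = 0b10000000 | (code_point & 0b00111111)
    # The remaining high bits fill the five slots of the lead byte (prefix 110).
    first = 0b11000000 | (code_point >> 6)
    return bytes([first, second])


encoded = encode_two_byte(191)
print(list(encoded))                      # [194, 191]
print(encoded == "\u00bf".encode("utf-8"))  # True
```

Comparing against the built-in `str.encode` is a quick way to check that the slots were filled in the right order.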
Wikipedia has a surprisingly good writeup on how this all works.