Java UTF-16 Encoding code

The function that encodes a Unicode code point (an int) into a char array in Java is basically this:

return new char[] { (char) codePoint };

That is just a cast from the integer value to a char.

I would like to know how this cast is actually done, i.e. the code behind the cast that converts an integer value into a character encoded in UTF-16. I tried looking for it in the Java source code, but with no luck.


I'm not sure which function you're talking about.

Casting valid int code points to char works for code points in the Basic Multilingual Plane, simply because of how UTF-16 is defined. To convert anything above U+FFFF you should use Character.toChars(int) to obtain the UTF-16 code units. The algorithm is defined in RFC 2781.
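
For example, a minimal sketch of Character.toChars on a supplementary code point (U+1F600 is chosen arbitrarily for illustration):

int codePoint = 0x1F600;                     // above U+FFFF, so two code units are needed
char[] units = Character.toChars(codePoint); // yields the surrogate pair { 0xD83D, 0xDE00 }
System.out.printf("%04x %04x%n", (int) units[0], (int) units[1]);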


The code point is just a number that maps to a character; there's no real conversion going on. Unicode code points are specified in hexadecimal, so whatever your codePoint is in hex will map to that character (or glyph).
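
For instance (a minimal illustration; the value U+00E9 is chosen arbitrarily):

int codePoint = 0x00E9;    // U+00E9, LATIN SMALL LETTER E WITH ACUTE
char c = (char) codePoint; // 'é': the cast simply reinterprets the number as a UTF-16 code unit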


Since a char is defined to hold UTF-16 data in Java, this is all there is to it. Some calculation is only necessary if the input is an int representing a Unicode code point of U+10000 or greater. All char values are already UTF-16.
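
As a sketch of that calculation, here is the surrogate-pair split from RFC 2781, which is essentially what Character.toChars does for supplementary code points (the helper name toSurrogatePair is made up for illustration):

// Illustrative only: split a code point >= U+10000 into a UTF-16 surrogate pair.
static char[] toSurrogatePair(int codePoint) {
    int offset = codePoint - 0x10000;               // leaves a 20-bit value
    char high = (char) (0xD800 + (offset >>> 10));  // top 10 bits -> high surrogate
    char low  = (char) (0xDC00 + (offset & 0x3FF)); // low 10 bits -> low surrogate
    return new char[] { high, low };
}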


All chars in Java are represented internally in UTF-16. This is just mapping the integer value to that char.


Also, char arrays are already UTF-16 in the Java platform.
