Can anyone tell me how to convert a UTF-8 value to a UCS-2 value in Objective-C?

2023-03-26 14:08 Q&A author:

I am trying to convert a UTF-8 string into a UCS-2 string. I need to get a string like "\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875". I have been googling for about a month now, but I still cannot find any reference on converting UTF-8 to UCS-2. Could someone please help me? Thanks in advance.

EDIT: Okay, maybe my explanation was not good enough. Here is what I am trying to do. I live in Korea, and I am trying to send an SMS message using CTMessageCenter. I tried to send Simplified Chinese characters through my app, and I got ???? instead of the proper characters. So I tried UTF-8 and UTF-16 (both BE and LE) as well, but they all returned ??. Finally I found out that SMS uses UCS-2 and EUC-KR encoding in Korea. Weird, isn't it? Anyway, I tried to send a string like \u4E3B\u9875 and it worked. So I need to convert the string into UCS-2 encoding first and then get the string literal from the result.

Wikipedia:

The older UCS-2 (2-byte Universal Character Set) is a similar character encoding that was superseded by UTF-16 in version 2.0 of the Unicode standard in July 1996. It produces a fixed-length format by simply using the code point as the 16-bit code unit and produces exactly the same result as UTF-16 for 96.9% of all the code points in the range 0-0xFFFF, including all characters that had been assigned a value at that time.
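The 96.9% figure can be checked with a little arithmetic, assuming it refers to every code point in 0-0xFFFF except the surrogate range 0xD800-0xDFFF that UTF-16 reserves for encoding characters above 0xFFFF. The sketch below is Python and is only an illustration of that assumption; the names are made up for this example.

```python
# Assumption: UCS-2 and UTF-16 agree on every code point in 0-0xFFFF
# except the surrogate range, which UTF-16 reserves for surrogate pairs.
BMP_SIZE = 0x10000  # number of code points in the range 0-0xFFFF


def is_surrogate(code_point: int) -> bool:
    """Return True if the code point lies in the UTF-16 surrogate range."""
    return 0xD800 <= code_point <= 0xDFFF


# Count the code points that both encodings represent identically.
agreeing = sum(1 for cp in range(BMP_SIZE) if not is_surrogate(cp))
share = agreeing / BMP_SIZE

print(agreeing, share)
```

Under that assumption 63488 of the 65536 code points agree, a share of 0.96875, which is roughly the 96.9% quoted above. Every character in the question's example string falls in the agreeing part.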

IBM:

Since the UCS-2 standard is limited to 65,535 characters, and the data processing industry needs over 94,000 characters, the UCS-2 standard is in the process of being superseded by the Unicode UTF-16 standard.

However, because UTF-16 is a superset of the existing UCS-2 standard, you can develop your applications using the system's existing UCS-2 support as long as your applications treat the UCS-2 as if it were UTF-16.

unicode.org:

UCS-2 is obsolete terminology which refers to a Unicode implementation up to Unicode 1.1, before surrogate code points and UTF-16 were added to Version 2.0 of the standard. This term should now be avoided.

UCS-2 does not define a distinct data format, because UTF-16 and UCS-2 are identical for purposes of data exchange. Both are 16-bit, and have exactly the same code unit representation.

So, using the "UTF8toUnicode" transformation in most language libraries will produce UTF-16, which is essentially UCS-2. And simply extracting the 16-bit characters from an Objective-C string will accomplish the same thing.
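As a rough illustration of that idea, here is a minimal sketch in Python rather than Objective-C. It decodes UTF-8 bytes, re-encodes them as big-endian UTF-16, and prints each 16-bit code unit as a \uXXXX escape. The function name to_ucs2_escapes is hypothetical. In Objective-C the corresponding step would presumably be reading the unichar values of an NSString, for example with characterAtIndex:, since those are UTF-16 code units.

```python
def to_ucs2_escapes(utf8_bytes: bytes) -> str:
    """Render UTF-8 input as a string of \\uXXXX escapes, one per 16-bit unit.

    Hypothetical helper for illustration. Characters above U+FFFF come out
    as UTF-16 surrogate pairs, which strict UCS-2 cannot represent.
    """
    text = utf8_bytes.decode("utf-8")   # UTF-8 bytes -> code points
    units = text.encode("utf-16-be")    # code points -> 16-bit units, big-endian, no BOM
    escapes = []
    for i in range(0, len(units), 2):
        unit = (units[i] << 8) | units[i + 1]  # join high and low byte
        escapes.append("\\u%04X" % unit)
    return "".join(escapes)


# The string from the question, written with escapes so the source stays ASCII.
sample = "\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875".encode("utf-8")
print(to_ucs2_escapes(sample))
```

For text that stays within 0-0xFFFF, as the question's example does, each character maps to exactly one escape, so the output is the kind of string literal the question asks for.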

In other words, the solution has been staring you in the face all along.

UCS-2 is not a valid Unicode encoding. UTF-8 is.

It is therefore impossible to convert UTF-8 into UCS-2 — and indeed, also the reverse.

UCS-2 is dead, ancient history. Let it rot in peace.
