What encoding does the System.Windows.Forms.RichTextBox use for Unicode characters?

I've got a WinForms RichTextBox in my application. When I enter the Chinese text "蜜蜜蜜蜜", the control uses the following RTF:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fmodern\fprq6\fcharset134 SimSun;}{\f1\fnil\fcharset0 Microsoft Sans Serif;}} \viewkind4\uc1\pard\f0\fs17\'c3\'db\'c3\'db\'c3\'db\'c3\'db\f1\par }

The test string is the same character four times. Its Unicode value is 34588 (0x871C). So how is it that the character is being stored as "\'c3\'db" in the RTF? What kind of encoding is that?


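RTF is old, older than Job, and considerably predates Unicode. I think it is using code page 936, a double-byte character set for Simplified Chinese; the \fcharset134 on the SimSun font in your font table is the GB2312 charset, which points the same way. Each \'hh escape in RTF is one raw byte written in hexadecimal, so your snippet stores the character as the byte pair C3 DB, and that matches the entry for this glyph in the code page 936 table.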
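One way to check this is to pull the \'hh escapes out of the RTF and decode the resulting bytes as code page 936. The following is a minimal Python sketch, not part of the original answer; the helper name decode_rtf_hex_escapes is hypothetical, and it assumes the fragment contains only \'hh escapes in a single code page (a real RTF reader also has to handle control words, groups, and \u escapes).

```python
import re


def decode_rtf_hex_escapes(rtf_fragment: str, codec: str = "cp936") -> str:
    """Decode the \\'hh escapes in an RTF fragment using the given code page.

    Hypothetical helper for illustration only: it ignores every other RTF
    construct and assumes all escapes belong to one code page.
    """
    # Each \'hh escape contributes exactly one byte.
    hex_pairs = re.findall(r"\\'([0-9a-fA-F]{2})", rtf_fragment)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    # cp936 is Python's alias for the GBK codec (Simplified Chinese).
    return raw.decode(codec)


# The text run from the RTF in the question.
fragment = r"\'c3\'db\'c3\'db\'c3\'db\'c3\'db"
text = decode_rtf_hex_escapes(fragment)

print(text)               # the decoded characters
print(hex(ord(text[0])))  # Unicode code point of the first character
```

If the code page guess is right, the first code point printed should be the 0x871c from the question, and encoding the character back with the same codec should give the bytes C3 DB again.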
