
Unicode surrogate character encoding in C#

I've got a problem with Unicode characters. When I try to encode surrogate characters (between D800 and DFFF), they are encoded as FFFD. I tried the Encoding.Unicode.GetString() method, which doesn't work, and the Decoder.GetChars() method, which doesn't work with every surrogate character.

I use the following code:

Encoding code:

string unicodeChars = "a\uD800\uDA65";
using (FileStream stream = new FileStream(@"unicode_encoding.txt", FileMode.Create, FileAccess.Write))
{
    byte[] buffer = Encoding.Unicode.GetBytes(unicodeChars);
    stream.Write(buffer, 0, buffer.Length);
}

Decoding code:

string decodedUnicodeChars;
using (FileStream stream2 = new FileStream(@"unicode_encoding.txt", FileMode.Open, FileAccess.Read))
using (StreamReader reader = new StreamReader(stream2, Encoding.Unicode))
{
    decodedUnicodeChars = reader.ReadToEnd();
}

foreach (char c in decodedUnicodeChars)
{
    Console.Write("{0} ", Convert.ToInt32(c).ToString("X4"));
}

Output is:

0061 FFFD FFFD


 string unicodeChars="a\uD800\uDA65";

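This is a case of GIGO: Garbage In, Garbage Out. The surrogate pair is not valid. A high surrogate (\uD800..\uDBFF) must be followed by a low surrogate, so the second one must be in the range \uDC00..\uDFFF. Each unpaired surrogate is replaced with U+FFFD, which is why the output shows FFFD twice.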
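As a rough illustration of the point above, here is a minimal sketch that checks a string for a valid surrogate pair before encoding it, and compares the round trip of the invalid string with a valid one. The class name SurrogateDemo, its helper methods RoundTrip and ToHex, and the sample value "a\uD800\uDC65" (U+10065) are assumptions made for this example and are not part of the original question.

```csharp
using System;
using System.Linq;
using System.Text;

public static class SurrogateDemo
{
    // Encode to UTF-16LE bytes and decode again, mirroring the file round trip.
    public static string RoundTrip(string s)
    {
        return Encoding.Unicode.GetString(Encoding.Unicode.GetBytes(s));
    }

    // Format each UTF-16 code unit as four hex digits, separated by spaces.
    public static string ToHex(string s)
    {
        return string.Join(" ", s.Select(c => ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        string invalid = "a\uD800\uDA65"; // high surrogate followed by another high surrogate
        string valid = "a\uD800\uDC65";   // high surrogate followed by a low surrogate (U+10065)

        Console.WriteLine(char.IsSurrogatePair(invalid, 1)); // False
        Console.WriteLine(char.IsSurrogatePair(valid, 1));   // True

        Console.WriteLine(ToHex(RoundTrip(invalid))); // 0061 FFFD FFFD
        Console.WriteLine(ToHex(RoundTrip(valid)));   // 0061 D800 DC65

        // The pair combines into a single code point.
        Console.WriteLine(char.ConvertToUtf32(valid, 1).ToString("X")); // 10065

        // With throwOnInvalidBytes set, the encoder is expected to reject
        // the malformed string instead of silently substituting U+FFFD.
        var strict = new UnicodeEncoding(false, false, true);
        try
        {
            strict.GetBytes(invalid);
        }
        catch (ArgumentException)
        {
            Console.WriteLine("invalid surrogate sequence rejected");
        }
    }
}
```

Using the strict encoding during development is one way to catch malformed strings early, rather than discovering FFFD in the output file later.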
