X.ToCharArray().Length EQUALS GetBytes(X).Length
string s = "test";
int charCount = s.ToCharArray().Length;
int byteCount = System.Text.Encoding.Default.GetBytes(s).Length;
When can (charCount != byteCount) happen? I believe it happens with Unicode (non-ASCII) characters, but not in the general case.
.NET strings support Unicode characters, but is Unicode the default encoding (System.Text.Encoding.Default) for .NET? Inspecting System.Text.Encoding.Default shows System.Text.SBCSCodePageEncoding, which is a single-byte encoding.
On .NET Core and .NET 5+, Encoding.Default is UTF-8, which uses 1 to 4 bytes per character; on .NET Framework it is the system's ANSI code page, which is why you see SBCSCodePageEncoding.
charCount and byteCount will not be equal whenever any character in string s needs more than one byte in the encoding used.
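As a minimal sketch (using Encoding.UTF8 explicitly so the result does not depend on Encoding.Default; the accented string is just an illustrative example):

using System;
using System.Text;

class CharVsByteCount
{
    static void Main()
    {
        // "test" is pure ASCII: every character fits in one UTF-8 byte.
        string ascii = "test";
        Console.WriteLine(ascii.ToCharArray().Length);               // 4
        Console.WriteLine(Encoding.UTF8.GetBytes(ascii).Length);     // 4

        // 'é' (U+00E9) needs two bytes in UTF-8,
        // so the byte count exceeds the char count.
        string accented = "h\u00E9llo";
        Console.WriteLine(accented.ToCharArray().Length);            // 5
        Console.WriteLine(Encoding.UTF8.GetBytes(accented).Length);  // 6
    }
}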
If you encode with Encoding.Unicode (UTF-16), each of the four characters in "test" takes 2 bytes, so byteCount will be 8:
int byteCount = System.Text.Encoding.Unicode.GetBytes(s).Length;
The character count will be different from the byte count whenever you use an encoding that uses more than one byte per character. This is the case for several encodings, including UTF-16 (the internal representation of .NET strings) and UTF-32.
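For illustration, here is a short sketch comparing byte counts for the same string under several encodings (the exact numbers assume a string containing only ASCII characters, as in the question):

using System;
using System.Text;

class EncodingByteCounts
{
    static void Main()
    {
        string s = "test"; // 4 chars, all in the ASCII range

        Console.WriteLine(Encoding.UTF8.GetBytes(s).Length);    // 4  (UTF-8: 1 byte per ASCII char)
        Console.WriteLine(Encoding.Unicode.GetBytes(s).Length); // 8  (UTF-16: 2 bytes per char)
        Console.WriteLine(Encoding.UTF32.GetBytes(s).Length);   // 16 (UTF-32: 4 bytes per char)
    }
}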