Why does char take 2 bytes when it could be stored in one byte?

2023-03-21 21:58 Q&A author:

Can anybody tell me why a char in C# takes two bytes when it could be stored in one byte? Isn't that a waste of memory? If not, how is the extra byte used? In simple words, please explain what the extra 8 bits are for.

"although it can be stored in one byte"

What makes you think that?

One byte is enough to represent every character used in English, but other languages use other characters. Consider the number of different writing systems (Latin, Chinese, Arabic, Cyrillic...) and the number of symbols in each of them (not only letters and digits, but also punctuation marks and other special symbols): there are tens of thousands of different symbols in use in the world! One byte is never going to be enough to represent them all, which is why the Unicode standard was created.
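To make that concrete, here is a minimal sketch that checks whether a few characters would fit in a single byte. The class and method names (`CharRangeDemo`, `FitsInOneByte`) are made up for illustration and are not part of .NET.

```csharp
using System;

public static class CharRangeDemo
{
    // Hypothetical helper: true when the char's numeric value is in 0..255,
    // i.e. it could be stored in a single byte.
    public static bool FitsInOneByte(char c) => c <= byte.MaxValue;

    public static void Main()
    {
        Console.WriteLine(sizeof(char));             // size of a char in bytes
        Console.WriteLine(FitsInOneByte('A'));       // Latin capital A, U+0041
        Console.WriteLine(FitsInOneByte('\u0416'));  // Cyrillic Zhe, U+0416
        Console.WriteLine(FitsInOneByte('\u4E2D'));  // CJK ideograph, U+4E2D
    }
}
```

Only the Latin letter falls within the one-byte range; the Cyrillic and Chinese characters have values above 255.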

Unicode has several encodings (UTF-8, UTF-16, UTF-32...). .NET strings use UTF-16, which takes two bytes per code unit (a char is a UTF-16 code unit, not necessarily a whole character). Of course, two bytes are still not enough to represent all the different symbols in the world; surrogate pairs, that is, two consecutive chars, are used to represent characters above U+FFFF.
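The difference between chars (code units) and actual characters (code points) can be seen with a character above U+FFFF. The following is only a sketch: `SurrogateDemo` and `CountCodePoints` are illustrative names, while `char.ConvertFromUtf32`, `char.ConvertToUtf32`, `char.IsHighSurrogate` and `char.IsLowSurrogate` are standard .NET methods.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: counts code points by treating each valid
    // high/low surrogate pair as a single character.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsHighSurrogate(s[i]) && i + 1 < s.Length && char.IsLowSurrogate(s[i + 1]))
            {
                i++; // skip the low surrogate of the pair
            }
            count++;
        }
        return count;
    }

    public static void Main()
    {
        // U+1F600 lies above U+FFFF, so UTF-16 needs two chars for it.
        string astral = char.ConvertFromUtf32(0x1F600);

        Console.WriteLine(astral.Length);                  // number of chars (code units)
        Console.WriteLine(CountCodePoints(astral));        // number of code points
        Console.WriteLine(char.ConvertToUtf32(astral, 0)); // the code point as an integer
    }
}
```

This is why `string.Length` reports the number of chars, which can be larger than the number of characters a reader would count.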

The char keyword is used to declare a Unicode character in the range indicated in the following table. Unicode characters are 16-bit characters used to represent most of the known written languages throughout the world.

http://msdn.microsoft.com/en-us/library/x9h8tsay%28v=vs.80%29.aspx

Unicode characters. True, 8 bits give enough room for the English alphabet, but languages such as Chinese need far more characters.

In C#, chars are 16-bit Unicode characters. Unicode supports a much larger character set than ASCII can.

If memory really is a concern, here is a good discussion on SO about how you might work with 8-bit chars: "Is there a string type with 8 BIT chars?"
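One possible approach, not necessarily the one recommended in that discussion, is to keep text as UTF-8 encoded bytes and decode it only when a string is needed. This is a rough sketch that assumes the text is mostly ASCII; `CompactTextDemo`, `Pack` and `Unpack` are hypothetical names, while `Encoding.UTF8.GetBytes` and `Encoding.UTF8.GetString` are standard .NET calls.

```csharp
using System;
using System.Text;

public static class CompactTextDemo
{
    // Hypothetical helpers: store text as UTF-8 bytes, decode on demand.
    public static byte[] Pack(string s) => Encoding.UTF8.GetBytes(s);
    public static string Unpack(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    public static void Main()
    {
        string text = "hello";
        byte[] packed = Pack(text);

        Console.WriteLine(text.Length * sizeof(char)); // bytes of char data as UTF-16
        Console.WriteLine(packed.Length);              // bytes when stored as UTF-8
        Console.WriteLine(Unpack(packed) == text);     // round trip preserves the text
    }
}
```

The saving only holds for ASCII-heavy text: in UTF-8, characters outside ASCII take two to four bytes each, and every conversion back to a string costs time.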

References:

On C#'s char datatype: http://msdn.microsoft.com/en-us/library/x9h8tsay(v=vs.80).aspx

On Unicode: http://en.wikipedia.org/wiki/Unicode

Because UTF-8 was probably still too young for Microsoft to consider using it.
