Size of wchar_t for Unicode encoding
Is there a 32-bit wide character type for encoding UTF-32 strings? I'd like to do it via std::wstring, which apparently has 16-bit wide characters on the Windows platform.
You won't be able to do it with std::wstring on many platforms because it will have 16-bit elements. Instead you should use std::basic_string<char32_t>, but this requires a compiler with some C++0x support.
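A minimal sketch of that approach (any compiler with C++11 support; the assert merely demonstrates the element width, which is 4 bytes on common platforms):

#include <cassert>
#include <string>

int main() {
    std::basic_string<char32_t> s = U"Hello";  // same type as std::u32string in C++11
    assert(sizeof(s[0]) == 4);                 // each element is one UTF-32 code unit
    s.push_back(U'\U0001F600');                // one code point stored as one element
    return 0;
}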
The size of wchar_t is platform-dependent and it is independent of UTF-8, UTF-16, and UTF-32 (it can be used to represent Unicode data, but nothing guarantees that it does).
I strongly recommend using UTF-8 with std::string for internal string representation, and using established libraries such as ICU for complex manipulation and conversion tasks involving Unicode.
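As a rough sketch of that approach (this assumes ICU is installed, the program links against icuuc, and the terminal accepts UTF-8 output):

#include <iostream>
#include <string>
#include <unicode/unistr.h>

int main() {
    std::string utf8 = "Gr\xC3\xBC\xC3\x9F" "e";            // "Grüße" as UTF-8 bytes
    icu::UnicodeString us = icu::UnicodeString::fromUTF8(utf8);
    us.toUpper();                                           // Unicode-aware case mapping
    std::string upper;
    us.toUTF8String(upper);                                 // convert back to UTF-8
    std::cout << upper << '\n';
    return 0;
}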
Just use typedef!
It would look something like this:
typedef int char_32;  /* assumes int is 32 bits on this platform */
And use it like this:
char_32 myChar;
or as a C-style string (note that a narrow literal such as "Hello World" will not convert to char_32*, so the elements must be spelled out):
const char_32 string_of_32_bit_char[] = { 'H', 'e', 'l', 'l', 'o', 0 };
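If your toolchain provides <cstdint> (standard since C++11, and widely available before that as <stdint.h>), a fixed-width type removes the guesswork about the size of int; a minimal sketch:

#include <cstdint>
typedef std::uint32_t char_32;  // guaranteed to be exactly 32 bits where uint32_t exists
char_32 myChar = 0x1F600;       // holds a single code point value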
The modern answer for this is to use char32_t (C++11), which can be used with std::u32string. However, in reality, you should just use std::string with an encoding like UTF-8. Note that before char32_t existed, the old approach was to use templates or macros to determine which unsigned integral type has a size of 4 bytes, and use that.
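A brief illustration of the C++11 pieces (std::wstring_convert with std::codecvt_utf8 was deprecated in C++17, but it shows the UTF-32/UTF-8 round trip compactly):

#include <codecvt>
#include <locale>
#include <string>

int main() {
    std::u32string u32 = U"Hello \U0001F600";    // char32_t elements, one per code point
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    std::string utf8 = conv.to_bytes(u32);       // UTF-32 -> UTF-8 bytes in a std::string
    std::u32string back = conv.from_bytes(utf8); // and back again
    return (back == u32) ? 0 : 1;
}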