
Converting C-Strings from Local Encoding to UTF-8

I'm writing a small app in which I read some text from the console, which is then stored in a classic char* string.

As it happens, I need to pass it to a library which only takes UTF-8 encoded strings. Since the Windows console uses the local encoding, I need to convert from the local encoding to UTF-8.

If I'm not mistaken, I could use MultiByteToWideChar(..) to convert to UTF-16 and then use WideCharToMultiByte(..) to convert to UTF-8.

However, I wonder if there is a way to convert directly from the local encoding to UTF-8 without using any external libraries, since the idea of converting to wchar just to be able to convert back to char (UTF-8 encoded, but still) seems kind of weird to me.


Converting from UTF-16 to UTF-8 is a purely mechanical process, but converting from a local encoding to UTF-16 or UTF-8 involves some large, specialized lookup tables. The C runtime just turns around and calls WideCharToMultiByte and MultiByteToWideChar for non-trivial cases.
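
To illustrate what "purely mechanical" means here, this is a minimal sketch of a UTF-16 to UTF-8 encoder that needs no tables, only bit shifting. The function name utf16_to_utf8 and its error convention are made up for this example; it is not what the C runtime or kernel32 does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes n UTF-16 code units as UTF-8.
   Returns the number of bytes written to dst, or (size_t)-1 if the
   input contains an unpaired surrogate or dst (cap bytes) is too small. */
size_t utf16_to_utf8(const uint16_t *src, size_t n,
                     unsigned char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= n || src[i + 1] < 0xDC00 || src[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1; /* low surrogate without a high one */
        }

        /* The code point's magnitude alone decides the byte count. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (need > cap - out)
            return (size_t)-1;

        switch (need) {
        case 1:
            dst[out++] = (unsigned char)cp;
            break;
        case 2:
            dst[out++] = (unsigned char)(0xC0 | (cp >> 6));
            dst[out++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            dst[out++] = (unsigned char)(0xE0 | (cp >> 12));
            dst[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        default:
            dst[out++] = (unsigned char)(0xF0 | (cp >> 18));
            dst[out++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        }
    }
    return out;
}
```

The local-encoding side has no such shortcut: each code page needs its own mapping table, which is why that half is left to the operating system.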

As for having to use UTF-16 as an intermediate stage, as far as I know, there isn't any way around that - sorry.
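
A rough sketch of that two-step round trip through UTF-16, using only MultiByteToWideChar and WideCharToMultiByte from kernel32, might look like this. The helper name codepage_to_utf8 is hypothetical, and which code page to pass is an assumption that depends on how the text was read: GetConsoleCP() returns the console's input code page, while CP_ACP is the system ANSI code page. The #ifdef _WIN32 guard is only there so the file still compiles on other platforms.

```c
#ifdef _WIN32
#include <windows.h>
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: converts a NUL-terminated string encoded in code
   page cp into a newly malloc'ed, NUL-terminated UTF-8 string.
   Returns NULL on failure; the caller frees the result. */
char *codepage_to_utf8(UINT cp, const char *src)
{
    assert(src != NULL);

    /* Step 1: code page -> UTF-16. Passing -1 as the length converts up
       to and including the terminating NUL; the first call only asks
       for the required buffer size in wide characters. */
    int wlen = MultiByteToWideChar(cp, 0, src, -1, NULL, 0);
    if (wlen <= 0)
        return NULL;
    wchar_t *wide = malloc((size_t)wlen * sizeof *wide);
    if (wide == NULL)
        return NULL;
    if (MultiByteToWideChar(cp, 0, src, -1, wide, wlen) <= 0) {
        free(wide);
        return NULL;
    }

    /* Step 2: UTF-16 -> UTF-8. For CP_UTF8 the last two arguments
       (default character and used-default flag) must be NULL. */
    char *utf8 = NULL;
    int ulen = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    if (ulen > 0) {
        utf8 = malloc((size_t)ulen);
        if (utf8 != NULL &&
            WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, ulen,
                                NULL, NULL) <= 0) {
            free(utf8);
            utf8 = NULL;
        }
    }
    free(wide);
    return utf8;
}
#endif
```

A call such as codepage_to_utf8(GetConsoleCP(), line) would then return a UTF-8 copy of line that can be handed to the library and freed afterwards.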

Since you are already linking to an external library to get file input, you might as well link to the same library to get WideCharToMultiByte and MultiByteToWideChar.

Using the C runtime will make your code recompilable for other operating systems (in theory), but it also adds a layer of overhead between you and the library that does all of the real work in this case: kernel32.dll.


The POSIX world loves the iconv lib for just that. It converts from and to virtually every encoding around, using char*.
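
For comparison, a minimal sketch of the iconv route could look like the following. The helper name to_utf8, the fixed-size output buffer, and the error convention are assumptions made for this example. On a POSIX system the name of the local encoding would typically come from nl_langinfo(CODESET) after calling setlocale(LC_ALL, ""); here it is simply passed in.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts NUL-terminated src from the encoding
   named fromcode to UTF-8, writing at most cap bytes (including the
   terminating NUL) to dst. Returns 0 on success, -1 on failure. */
int to_utf8(const char *fromcode, const char *src, char *dst, size_t cap)
{
    assert(fromcode != NULL && src != NULL && dst != NULL && cap > 0);

    iconv_t cd = iconv_open("UTF-8", fromcode); /* (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv takes a non-const input pointer but does not modify the
       input bytes themselves. */
    char *in = (char *)src;
    size_t inleft = strlen(src);
    char *out = dst;
    size_t outleft = cap - 1; /* keep one byte for the NUL */

    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    if (rc != (size_t)-1) {
        /* Flush any pending shift sequence (matters for stateful
           encodings only). */
        rc = iconv(cd, NULL, NULL, &out, &outleft);
    }
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1; /* invalid input, or dst too small */
    *out = '\0';
    return 0;
}
```

Depending on the platform, this may need to be linked against a separate libiconv; with glibc, iconv is part of the C library itself.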
