How to change a string's encoding to UTF-8 in C
How can I change the character encoding of a string to UTF-8? I am making some execv calls to a Python program, but Python returns the strings with some characters cut off. I don't know if this is a Python issue or a C issue, but I thought that if I can change the string's encoding in C and then pass it to Python, it should do the trick. So how can I do that?
Thanks.
C as a language does not facilitate string encoding. A C string is simply a null-terminated sequence of char values (8-bit integers on most systems; whether char is signed or unsigned is implementation-defined).
A wide string (with characters of type wchar_t, typically 16 or 32 bits depending on the platform) can also be used to hold larger character values; however, again, the C string functions and data types do not record or track which encoding a string is in.
The answer to your question is to ensure that the strings you're passing into Python are encoded as UTF-8.
In order to help you accomplish that in any detailed capacity, however, you will have to provide more information about how your strings are currently formed, what they contain, and how you're constructing your argument list for exec.
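In the meantime, one way to check whether the bytes you are about to pass to exec are already well-formed UTF-8 is to validate them on the C side. The following is only a minimal sketch: is_valid_utf8 is a hypothetical helper name, not a standard library function, and it assumes a null-terminated input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the null-terminated string s is
   well-formed UTF-8, 0 otherwise. Rejects overlong forms, surrogates
   (U+D800..U+DFFF) and code points above U+10FFFF. */
int is_valid_utf8(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned long cp;
        int extra;  /* number of continuation bytes expected */

        if (p[0] < 0x80)                { cp = p[0];        extra = 0; }
        else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
        else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
        else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
        else return 0;  /* stray continuation byte or invalid lead byte */

        for (int i = 1; i <= extra; i++) {
            /* A terminating NUL fails this test too, so we never read
               past the end of a truncated sequence. */
            if ((p[i] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (p[i] & 0x3F);
        }

        if (extra == 1 && cp < 0x80)    return 0;  /* overlong 2-byte form */
        if (extra == 2 && cp < 0x800)   return 0;  /* overlong 3-byte form */
        if (extra == 3 && cp < 0x10000) return 0;  /* overlong 4-byte form */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate */
        if (cp > 0x10FFFF) return 0;

        p += extra + 1;
    }
    return 1;
}
```

If this returns 0 for one of your arguments, the string is in some other encoding (or was truncated in the middle of a multi-byte sequence) and needs converting before it reaches Python.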
There is no such thing as character encoding in C.
A char* can hold any data; how you interpret the characters is up to you. For instance, printf will typically dump the characters as they are to the standard output, and if your console interprets those characters as UTF-8, they'll appear as such.
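To see what a char* actually holds, it can help to look at the raw bytes rather than at what the console renders. Here is a rough sketch; hex_dump is a hypothetical helper name, and the caller is assumed to supply the output buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of s into out as space-separated
   uppercase hex. Returns the number of bytes in s, or -1 if out is too
   small to hold the result. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    assert(s != NULL && out != NULL);

    size_t n = strlen(s);
    /* Two hex digits per byte, n-1 separators, one terminating NUL. */
    size_t need = (n == 0) ? 1 : 3 * n;
    if (out_size < need) return -1;

    char *w = out;
    for (size_t i = 0; i < n; i++) {
        /* Cast through unsigned char so bytes >= 0x80 are not sign-extended. */
        w += sprintf(w, i ? " %02X" : "%02X", (unsigned)(unsigned char)s[i]);
    }
    *w = '\0';
    return (int)n;
}
```

For example, dumping the two-byte string "\xC3\xA9" gives "C3 A9". A UTF-8 console shows those bytes as a single é, while a Latin-1 console shows them as two separate characters; the bytes in memory are the same either way.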
If you want to convert between different encodings on the C side, you can have a look at ICU.
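For one simple case you may not need a library at all. The sketch below is not ICU; it is a hand-written conversion that assumes your source strings are ISO-8859-1 (Latin-1), which the question does not actually state. latin1_to_utf8 is a hypothetical name. It works because every Latin-1 byte maps directly to the Unicode code point with the same value, so each byte becomes either one or two UTF-8 bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a null-terminated ISO-8859-1 string to a
   newly allocated UTF-8 string. The caller must free() the result.
   Returns NULL if allocation fails. */
char *latin1_to_utf8(const char *src)
{
    assert(src != NULL);

    size_t len = strlen(src);
    /* Worst case: every input byte expands to two output bytes. */
    char *dst = malloc(2 * len + 1);
    if (dst == NULL) return NULL;

    unsigned char *w = (unsigned char *)dst;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80) {
            *w++ = c;  /* ASCII is unchanged in UTF-8 */
        } else {
            *w++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            *w++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    *w = '\0';
    return dst;
}
```

So the Latin-1 string "caf\xE9" comes out as "caf\xC3\xA9". If your input is in any other encoding (Windows-1252, Shift-JIS, and so on), this shortcut does not apply and a real conversion library is the better route.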
If you want to convert between encodings on the Python side, look at http://docs.python.org/howto/unicode.html.