Python: Passing a Unicode string to a C++ module
I'm working with an existing module at the moment that provides a C++ interface and does a few operations with strings.
I needed to use Unicode strings and the module unfortunately didn't have any support for a Unicode interface, so I wrote an extra function to add to the interface:
void SomeUnicodeFunction(const wchar_t* string)
However, when I attempt to use the following code in Python:
SomeModule.SomeUnicodeFunction(ctypes.c_wchar_p(unicode_string))
I get this error:
ArgumentError: Python argument types in
SomeModule.SomeUnicodeFunction(SomeModule, c_wchar_p)
did not match C++ signature:
SomeUnicodeFunction(... {lvalue}, wchar_t const*)
(names have been changed).
I've tried changing wchar_t in the C++ module to Py_UNICODE with no success. How do I solve this problem?
On Linux you don't have to change your API; just encode the string as UTF-8:
SomeModule.SomeFunction(str(s.encode('utf-8')))
On Windows, the Unicode APIs use UTF-16 LE (little-endian), so you have to encode it this way:
SomeModule.SomeFunctionW(str(s.encode('utf-16-le')))
Good to know: the size of wchar_t differs between platforms and can be 8, 16 or 32 bits. It is typically 16 bits on Windows and 32 bits on Linux.
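If you want to check what wchar_t looks like on the machine you're running on, ctypes can tell you. This is only a rough sketch: native_wide_codec is a made-up helper name (not part of ctypes or of your module), and it assumes wchar_t is either 16 or 32 bits, which covers the usual desktop platforms.

```python
import ctypes
import sys

# Size of the platform's wchar_t in bytes, as reported by ctypes.
WCHAR_BYTES = ctypes.sizeof(ctypes.c_wchar)


def native_wide_codec():
    """Hypothetical helper: name the codec matching the native wchar_t layout."""
    endian = 'le' if sys.byteorder == 'little' else 'be'
    if WCHAR_BYTES == 2:
        return 'utf-16-' + endian
    if WCHAR_BYTES == 4:
        return 'utf-32-' + endian
    # Assumption: other sizes are not handled in this sketch.
    raise RuntimeError('unexpected wchar_t size: %d bytes' % WCHAR_BYTES)


print('wchar_t is %d bits here' % (WCHAR_BYTES * 8))
print('matching codec: %s' % native_wide_codec())
```

The explicit -le / -be codec names are used on purpose: they don't prepend a byte order mark, so the encoded length is exactly the number of code units times the wchar_t size.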
I found a hack to work around the problem:
SomeModule.SomeUnicodeFunction(str(s.encode('utf-8')))
It seems to be working fine for my purposes so far.
Update: Actually, using UTF-8 means I don't need SomeUnicodeFunction at all and can use the standard SomeFunction without specialising it for Unicode. You learn something new every day, I guess :).
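A probable reason the UTF-8 route works with the unmodified function: UTF-8 never emits a zero byte unless the text itself contains a NUL character, so the encoded string survives a NUL-terminated char* interface. UTF-16 does contain zero bytes for plain ASCII characters. The sketch below only demonstrates that property in Python. It assumes SomeFunction takes an ordinary const char*, which the post doesn't actually show, and has_embedded_nul is just an illustrative helper.

```python
def has_embedded_nul(data):
    """Return True if the encoded byte string contains a zero byte."""
    return b'\x00' in data


# Sample text mixing ASCII, an accented Latin letter and CJK characters.
text = u'caf\u00e9 \u65e5\u672c\u8a9e'

utf8_bytes = text.encode('utf-8')
utf16_bytes = text.encode('utf-16-le')

# UTF-8 is safe for NUL-terminated char* strings; UTF-16 is not,
# because every ASCII character is followed by a zero byte.
print('UTF-8 has NUL bytes:  %s' % has_embedded_nul(utf8_bytes))
print('UTF-16 has NUL bytes: %s' % has_embedded_nul(utf16_bytes))

# Decoding on the receiving side recovers the original text.
assert utf8_bytes.decode('utf-8') == text
```

So if the C++ side just stores or forwards the bytes, UTF-8 passes through untouched. Anything that counts characters or indexes into the string on the C++ side would still have to be aware that one character can span several bytes.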