Convert a URL-encoded string into a Python unicode string
I have strings encoded in the following form: La+Cit%C3%A9+De+la+West that I stored in a SQLite VARCHAR field in Python.
These are apparently UTF-8 encoded binary strings converted to urlencoded strings. The question is how to convert it back to a unicode string.
s = 'La+Cit%C3%A9+De+la+West'
I used the urllib.unquote_plus( s ) Python function, but it doesn't convert the %C3%A9 into a single unicode character. I see 'La CitÃ© De la West' instead of the expected 'La Cité De la West'.
I'm running my code on Ubuntu, not Windows, and the encoding is UTF-8.
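As we discussed, it looks like the problem was that you were starting with a unicode object, not a byte string (str). Given a unicode object, urllib.unquote_plus turns each percent-escaped byte into a character of its own, so the two bytes of the UTF-8 encoded é come out as two characters. You want a str: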
>>> import urllib
>>> s1 = u'La+Cit%C3%A9+De+la+West'
>>> type(s1)
<type 'unicode'>
>>> print urllib.unquote_plus(s1)
La CitÃ© De la West
>>> s2 = str(s1)
>>> type(s2)
<type 'str'>
>>> print urllib.unquote_plus(s2)
La Cité De la West
>>> import sys
>>> sys.stdout.encoding
'UTF-8'
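The session above is Python 2. As a rough sketch for readers on Python 3 (an assumption, since the original post does not cover it), the same function lives in urllib.parse and decodes the percent-escaped bytes as UTF-8 by default. The variable names decoded, garbled and repaired are only illustrative. Decoding the same bytes as Latin-1 reproduces the garbled text from the question, which suggests what went wrong there:

```python
from urllib.parse import unquote_plus

s = 'La+Cit%C3%A9+De+la+West'

# Python 3: the percent-escaped bytes are decoded as UTF-8 by default.
decoded = unquote_plus(s)

# Decoding the same bytes as Latin-1 maps each byte to one character,
# which reproduces the garbled result described in the question.
garbled = unquote_plus(s, encoding='latin-1')

# Text garbled this way can be repaired by undoing the wrong decoding
# and decoding the recovered bytes as UTF-8.
repaired = garbled.encode('latin-1').decode('utf-8')

print(decoded)
print(garbled)
print(repaired)
```

In Python 2, urllib.unquote_plus(s2) in the session above returns a byte str; calling .decode('utf-8') on that result should give the unicode object the question asks for.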