
Error with urlencode in Python

I have this:

a = {'album': u'Metamorphine', 'group': 'monoku', 'name': u'Son Of Venus (Danny\xb4s Song)', 'artist': u'Leandra', 'checksum': '2836e33d42baf947e8c8adef48921f2f76fcb37eea9c50b0b59d7651', 'track_number': 8, 'year': '2008', 'genre': 'Darkwave', 'path': u'/media/data/musik/Leandra/2008. Metamorphine/08. Son Of Venus (Danny\xb4s Song).mp3', 'user_email': 'diegueus9@gmail.com', 'size': 6624104}
data = urllib.urlencode(a)

And that raises an exception:

Traceback (most recent call last):
  File "playkud.py", line 44, in <module>
    main()
  File "playkud.py", line 29, in main
    craw(args, options.user_email, options.group)
  File "/home/diegueus9/workspace/playku/src/client/playkud/crawler/crawler.py", line 76, in craw
    index(root, file, data, user_email, group)
  File "/home/diegueus9/workspace/playku/src/client/playkud/crawler/crawler.py", line 58, in index
    done = add_song(data[mp3file])
  File "/home/diegueus9/workspace/playku/src/client/playkud/service.py", line 32, in add_song
    return make_request(URL+'add_song/', data)
  File "/home/diegueus9/workspace/playku/src/client/playkud/service.py", line 14, in make_request
    data = urllib.urlencode(dict([k.encode('utf-8'),v] for k,v in mp3_data.items()))
  File "/usr/lib/python2.5/urllib.py", line 1250, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 19: ordinal not in range(128)

And with IPython (Python 2.5):

In [7]: urllib.urlencode(a)
UnicodeEncodeError                        Traceback (most recent call last)

/home/diegueus9/<ipython console> in <module>()

/usr/lib/python2.5/urllib.pyc in urlencode(query, doseq)
   1248         for k, v in query:
   1249             k = quote_plus(str(k))
-> 1250             v = quote_plus(str(v))
   1251             l.append(k + '=' + v)
   1252     else:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 19: ordinal not in range(128)

How can I fix it?


urllib.urlencode expects its data as str, and doesn't deal well with Unicode data since it provides no way to specify an encoding. Try this instead:

mp3_data = {'album': u'Metamorphine',
     'group': 'monoku',
     'name': u'Son Of Venus (Danny\xb4s Song)',
     'artist': u'Leandra',
     'checksum': '2836e33d42baf947e8c8adef48921f2f76fcb37eea9c50b0b59d7651',
     'track_number': 8,
     'year': '2008', 'genre': 'Darkwave',
     'path': u'/media/data/musik/Leandra/2008. Metamorphine/08. Son Of Venus (Danny\xb4s Song).mp3',
     'user_email': 'diegueus9@gmail.com',
     'size': 6624104}

import urllib

# Convert every value to a UTF-8 encoded str before calling urlencode
str_mp3_data = {}
for k, v in mp3_data.iteritems():
    str_mp3_data[k] = unicode(v).encode('utf-8')
data = urllib.urlencode(str_mp3_data)

What I did was ensure that all data is encoded into str using UTF-8 before passing the dictionary into the urlencode function.
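
The same idea can be wrapped in a small helper. This is only a sketch, not part of the original answer: encode_values and text_type are names I made up for illustration, and the import fallback is there so the snippet runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import urlencode          # Python 2
    text_type = unicode
except ImportError:
    from urllib.parse import urlencode    # Python 3
    text_type = str


def encode_values(params, encoding='utf-8'):
    """Return a copy of params whose text keys and values are encoded bytes.

    encode_values is a hypothetical helper, not a urllib function.
    """
    encoded = {}
    for key, value in params.items():
        # Integers and other non-string values become text first.
        if not isinstance(value, (text_type, bytes)):
            value = text_type(value)
        # Text is encoded explicitly, so urlencode never has to fall back
        # on the default ASCII codec.
        if isinstance(value, text_type):
            value = value.encode(encoding)
        if isinstance(key, text_type):
            key = key.encode(encoding)
        encoded[key] = value
    return encoded


data = urlencode(encode_values({'name': u'Son Of Venus (Danny\xb4s Song)'}))
```

Values that are already byte strings are passed through untouched, so this assumes they are in the encoding the server expects.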


If you are using Django, take a look at Django's QueryDict class; it has a urlencode() method.

Or, if you only need the helper function itself, you may use Django's own urlencode (django.utils.http.urlencode). It basically does what is described in the other answers, as a wrapper around the original urllib.urlencode.


The problem is that some of the values in your mp3_data dict are unicode strings containing characters that can't be represented in ASCII, the default encoding that str() uses inside urlencode() (other values are plain ASCII strings or integers, which convert without trouble). You can fix this by encoding those values before passing them to urlencode(). On line 14 of /home/diegueus9/workspace/playku/src/client/playkud/service.py, in make_request(), try changing this:

data = urllib.urlencode(dict([k.encode('utf-8'),v] for k,v in mp3_data.items()))

to this:

data = urllib.urlencode(dict([k.encode('utf-8'),unicode(v).encode('utf-8')] for k,v in mp3_data.items()))


The problem is that str() tries to convert a unicode string to a byte string using the ASCII codec, and some characters (such as u'\xb4') have no ASCII representation. So I would advise you to find the values that are unicode strings and encode them explicitly, as follows.

For example, where v is a unicode string, try changing:

quote_plus(str(v))

to

quote_plus(str(v.encode("utf-8")))

That should help.
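
To see the difference in isolation, here is a small sketch (my own illustration, not code from the question; it is written to run on both Python 2 and Python 3). Encoding the value to ASCII fails in the same way as str(v) does on Python 2, while encoding to UTF-8 first gives quote_plus a byte string it can escape.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote_plus          # Python 2
except ImportError:
    from urllib.parse import quote_plus    # Python 3

value = u'Son Of Venus (Danny\xb4s Song)'

# The ASCII codec cannot represent u'\xb4', so this raises the same
# UnicodeEncodeError that appears in the traceback.
try:
    value.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 first yields bytes that quote_plus percent-escapes.
quoted = quote_plus(value.encode('utf-8'))
```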


If you do not have to use Python 2.x, you could switch to Python 3.x, where all strings are Unicode by default. You would have to convert some of your code for that, though (you could automate this partly or fully with 2to3).
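
As a rough sketch of what that looks like on Python 3 (the dict is a shortened version of the one in the question): urlencode lives in urllib.parse there, and it encodes str values as UTF-8 by default, so no manual encoding step is needed.

```python
from urllib.parse import urlencode, parse_qs

mp3_data = {
    'name': 'Son Of Venus (Danny\xb4s Song)',
    'artist': 'Leandra',
    'track_number': 8,
}

# str values are percent-encoded as UTF-8; the int is converted with str().
data = urlencode(mp3_data)

# Decoding the query string again recovers the original text.
decoded = parse_qs(data)
```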
