Google App Engine UrlFetch - problem with urls with special characters in them

I'm using Google Translate to convert a piece of text to speech with this url:

http://translate.google.com/translate_tts?tl=%s&q=%s

Where the parameter tl contains the language code of the language of the text you want converted to speech, and q contains the text you want converted.

Normal words (without special characters) return the correct audio file.

So in my application this is what I do (no is the language code for Norwegian):

url = "http://translate.google.com/translate_tts?tl=%s&q=%s" % ('no', urllib.quote('kjendis'))
#url = http://translate.google.com/translate_tts?tl=no&q=kjendis
self.response.headers["Content-Type"] = "audio/mpeg"
self.response.out.write(urlfetch.fetch(url).content)

This returns the correct sound.

I'm using plain webapp btw.

But when I have a word with a special character in it (vår) something isn't right. The url generated is http://translate.google.com/translate_tts?tl=no&q=v%C3%A5r. (the å is correctly transformed to percent encoding)

When opening that url with my browser I get the correct sound, but when using urlfetch.fetch to read the same url the sound returned is not correct.

What is going wrong here? I can only assume that fetch is altering the url somehow.


Apparently the problem is not an App Engine problem, but it has to do with the way the Google Translate url handles different user agents.

An example:

#!/usr/bin/env python
#coding=utf-8

import urllib

class MyOpener(urllib.FancyURLopener):
    # version = "App/1.7"  # doesn't work
    version = "Mozilla/4.0 (MSIE 6.0; Windows NT 5.0)"  # works

def textToSpeech(text, languageCode='en'):
    url = "http://translate.google.com/translate_tts?tl=%s&q=%s" % (languageCode, urllib.quote(text))
    myopener = MyOpener()
    # Plain GET: a second argument to open() is sent as POST data, not used as a file mode.
    return myopener.open(url).read()

open('urllib.mp3', 'wb').write(textToSpeech('vår', 'no'))

When MyOpener uses the browser-style (Mozilla/MSIE) user agent string, everything works as expected, but with the other user agent string ("App/1.7") the sound returned is not correct.
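Carrying this back to the App Engine code in the question, a minimal sketch would be to pass a browser-like User-Agent through the headers argument of urlfetch.fetch. The helper name build_tts_request and the constants below are made up for illustration, and the urlfetch call is left as a comment because it only runs inside App Engine. I have not verified that this is enough for the Translate endpoint.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote          # Python 2 (the runtime used in the question)
except ImportError:
    from urllib.parse import quote    # Python 3

TTS_URL = "http://translate.google.com/translate_tts?tl=%s&q=%s"
# Assumption: any browser-like string; this one is taken from the example above.
BROWSER_UA = "Mozilla/4.0 (MSIE 6.0; Windows NT 5.0)"

def build_tts_request(text, language_code="en"):
    """Hypothetical helper: return (url, headers) for a text-to-speech request."""
    if not isinstance(text, bytes):
        text = text.encode("utf-8")   # percent-encode the UTF-8 bytes
    url = TTS_URL % (language_code, quote(text))
    return url, {"User-Agent": BROWSER_UA}

url, headers = build_tts_request(u"vår", "no")
print(url)

# Inside a webapp handler on App Engine (not run here):
#   from google.appengine.api import urlfetch
#   result = urlfetch.fetch(url, headers=headers)
#   self.response.headers["Content-Type"] = "audio/mpeg"
#   self.response.out.write(result.content)
```

As far as I know, App Engine appends its own identifier to whatever User-Agent you set, so the header the server sees will not be exactly the string above.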
