Python - urllib2.urlopen - Why do I get garbled characters?
Here's my problem:
import urllib2
response=urllib2.urlopen('http://proxy-heaven.blogspot.com/')
html=response.read()
print html
It happens only with this site, and I don't know why the result is all garbled characters. Can anyone help?
Without your output it's hard to say, but I'd bet it's an encoding issue: this website is encoded in UTF-8. If your terminal is set to ISO Latin-1 (ISO-8859-1), for example, it cannot display those characters properly.
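To make the mismatch concrete, here is a minimal sketch that reproduces the garbling without any network access. It assumes the page bytes are UTF-8 and the terminal interprets them as Latin-1. The function name simulate_garbling and the sample string are made up for illustration and are not taken from the site in question.

```python
# -*- coding: utf-8 -*-

def simulate_garbling(text, page_encoding='utf-8', terminal_encoding='latin-1'):
    """Return (garbled, correct) views of the same bytes.

    Hypothetical helper: encodes `text` the way the server would send it,
    then decodes it once with the wrong codec and once with the right one.
    """
    raw = text.encode(page_encoding)          # bytes on the wire
    garbled = raw.decode(terminal_encoding)   # what a mismatched terminal shows
    correct = raw.decode(page_encoding)       # what was actually meant
    return garbled, correct


# Sample input (assumption): one non-ASCII character is enough to show the effect.
garbled, correct = simulate_garbling(u'caf\xe9')
```

The single character é becomes two bytes in UTF-8, and a Latin-1 terminal shows each byte as its own character, which is where the "garbled" look comes from.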
Works for me:
import urllib
response=urllib.urlopen('http://proxy-heaven.blogspot.com/')
a = response.read()
print a[:50]
> '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Stric'
You may have an encoding problem in your terminal, though.
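Encoding may be your problem, in which case you want to decode the response explicitly. In Python 2, str() does not accept an encoding argument, so call decode() on the bytes you read:
import urllib
s = urllib.urlopen('http://proxy-heaven.blogspot.com/').read().decode('utf-8')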
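If the terminal really is the culprit, one possible workaround is to decode the page and then re-encode it to whatever the terminal expects, replacing characters it cannot represent. The sketch below assumes the page is UTF-8. The helper name recode_for_terminal and its parameters are hypothetical, and the fallback to ASCII when sys.stdout.encoding is unavailable is my own choice.

```python
import sys


def recode_for_terminal(raw, page_encoding='utf-8', terminal_encoding=None):
    """Convert raw page bytes into bytes the terminal can display.

    Hypothetical helper: `raw` is what response.read() returned.
    Characters the terminal encoding cannot represent become '?'.
    """
    if terminal_encoding is None:
        # sys.stdout.encoding can be None (e.g. when output is piped).
        terminal_encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
    text = raw.decode(page_encoding)                   # bytes -> unicode
    return text.encode(terminal_encoding, 'replace')   # unicode -> terminal bytes
```

In Python 2 you would pass the result of response.read() to this function and print what it returns. If the page turns out not to be UTF-8, the decode step raises UnicodeDecodeError, which at least tells you the encoding guess was wrong.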