urllib2 gives HTTP Error 400: Bad Request for certain urls, works for others
I'm trying to do a simple HTTP GET request with Python's urllib2 module. It works sometimes, but other times I get HTTP Error 400: Bad Request. I know it's not an issue with the URL itself, because if I use urllib and simply do urllib.urlopen(url) it works fine, but when I add headers and do urllib2.urlopen() I get Bad Request on certain sites.
Here is the code that's not working:
# -*- coding: utf-8 -*-
import re,sys,urllib,urllib2
url = "http://www.gamestop.com/"
headers = {'User-Agent:':'Mozilla/5.0'}
req = urllib2.Request(url,None,headers)
response = urllib2.urlopen(req,None)
html1 = response.read()
(gamestop.com is an example of a URL that does not work)
Some different sites work, some don't, so I'm not sure what I'm doing wrong here. Am I missing some important headers? Making the request incorrectly? Using the wrong User-Agent? (I also tried using the exact User-Agent of my browser, and that didn't fix anything)
Thanks!
You've got an extra colon in your headers.
headers = { 'User-Agent:': 'Mozilla/5.0' }
Should be:
headers = { 'User-Agent': 'Mozilla/5.0' }
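To see why the stray colon causes a 400, here's a small sketch (no network needed) of the header line that ends up on the wire. urllib2 hands headers to httplib, which joins name and value as "Name: value", so a name that already ends in a colon produces a doubled colon that strict servers reject as malformed:

```python
def wire_line(headers):
    # Mimics how httplib's putheader() serializes a header:
    # it joins the name and value with ': '.
    (name, value), = headers.items()
    return "%s: %s" % (name, value)

bad = {'User-Agent:': 'Mozilla/5.0'}   # the typo from the question
good = {'User-Agent': 'Mozilla/5.0'}   # the fixed version

print(wire_line(bad))   # User-Agent:: Mozilla/5.0  <- doubled colon, malformed
print(wire_line(good))  # User-Agent: Mozilla/5.0
```

Lenient servers ignore the bogus header, which is why some sites worked anyway; stricter ones (like gamestop.com here) answer 400 Bad Request.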