Accessing Google with Python

How can I access Google from Python?

I tried this code:

urllib.urlopen('http://www.google.com')

but it shows a message asking me to prove I'm human, or something like that.

Some people say to try setting a user agent, but I'm not sure.


You should use the Google API to access search. Here's an example in Python. unutbu provided a link to an older SO answer that contains a corrected version of the same example code.

#!/usr/bin/python
import json
import urllib
import urllib2

# Optional: set these to your API key and the end user's IP address.
api_key, userip = None, None
query = {'q': 'search google python api'}
referrer = "https://stackoverflow.com/q/3900610"

if userip:
    query.update(userip=userip)
if api_key:
    query.update(key=api_key)

url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % (
    urllib.urlencode(query))

# Send the Referer header along with the request.
request = urllib2.Request(url, headers=dict(Referer=referrer))
# Use a separate name so the json module is not shadowed.
response = json.load(urllib2.urlopen(request))

results = response['responseData']['results']
for r in results:
    print r['title'] + ": " + r['url']
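
For anyone on Python 3, where `urllib.urlencode` lives in `urllib.parse`, here is a rough sketch of the two pure-logic steps in the example above: building the query URL and pulling titles and URLs out of the decoded JSON. The helper names `build_search_url` and `extract_results` are made up for illustration, and `sample_body` is a hypothetical response that only mimics the `responseData` / `results` / `title` / `url` fields the code above reads. Nothing here contacts the network.

```python
import json
from urllib.parse import urlencode

BASE_URL = 'http://ajax.googleapis.com/ajax/services/search/web'


def build_search_url(terms, api_key=None, userip=None):
    """Build the request URL with the same parameters as the example above."""
    query = {'v': '1.0', 'q': terms}
    if userip:
        query['userip'] = userip
    if api_key:
        query['key'] = api_key
    return BASE_URL + '?' + urlencode(query)


def extract_results(payload):
    """Return (title, url) pairs from an already-decoded JSON response."""
    return [(r['title'], r['url'])
            for r in payload['responseData']['results']]


# Hypothetical response body, shaped like the fields read in the example.
sample_body = ('{"responseData": {"results": '
               '[{"title": "Example", "url": "http://example.com/"}]}}')

search_url = build_search_url('search google python api')
pairs = extract_results(json.loads(sample_body))
for title, url in pairs:
    print(title + ": " + url)
```

Keeping the URL building and the result extraction in separate functions makes it possible to check both without sending a request.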


A user agent string is indeed the way to go. Pick any valid user agent string from a common browser. In Python 2.x, the following code should give you what you want:

import urllib2
r = urllib2.Request('http://www.google.com/')
r.add_header('User-Agent', 
             'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.19) '
             'Gecko/20081202 Firefox (Debian-2.0.0.19-0etch1)')
html = urllib2.urlopen(r).read()
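
If you are on Python 3 instead, `urllib2` was merged into `urllib.request`, so a minimal sketch of the same idea might look like the following. The helper names `make_request` and `fetch` are only illustrative, and the user agent string is the one from the snippet above.

```python
import urllib.request

USER_AGENT = ('Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.19) '
              'Gecko/20081202 Firefox (Debian-2.0.0.19-0etch1)')


def make_request(url, user_agent=USER_AGENT):
    """Build a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def fetch(url):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(make_request(url)) as response:
        return response.read()


# Building the request does not touch the network; fetch() does.
req = make_request('http://www.google.com/')
```

Calling `fetch('http://www.google.com/')` performs the actual download and returns bytes, which you would decode before parsing.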

Having said that, unutbu's recommendation to use the Google search API (if you're looking to do searches) is by far the better way to go, since it avoids all that messy HTML parsing.
