开发者

Using CURL with Google

I want to CURL to Google to see how many results it returns for a certain search.

I've tried this:

  $url = "http://www.google.com/search?q=".$strSearch."&hl=en&start=0&sa=N";
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_HEADER, 0);
  curl_setopt($ch, CURLOPT_VERBOSE, 0);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl开发者_C百科_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_POST, true);
  $response = curl_exec($ch);
  curl_close($ch);

But it just returns a 405 Method Allowed google error.

Any ideas?

Thanks


Use a GET request instead of a POST request. That is, get rid of

curl_setopt($ch, CURLOPT_POST, true);

Or even better, use their well defined search API instead of screen-scraping.


Scrapping Google is a very easy thing to do. However, if you don't require more than the first 30 results, then the search API is preferable (as others have suggested). Otherwise, here's some sample code. I've ripped this out of a couple of classes that I'm using so it might not be totally functional as is, but you should get the idea.

function queryToUrl($query, $start=null, $perPage=100, $country="US") {
    return "http://www.google.com/search?" . $this->_helpers->url->buildQuery(array(
        // Query
        "q"     => urlencode($query),
        // Country (geolocation presumably)
        "gl"    => $country,
        // Start offset
        "start" => $start,
        // Number of result to a page
        "num"   => $perPage
    ), true);
}

// Find first 100 result for "pizza" in Canada
$ch = curl_init(queryToUrl("pizza", 0, 100, "CA"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_USERAGENT,      $this->getUserAgent(/*$proxyIp*/));
curl_setopt($ch, CURLOPT_MAXREDIRS,      4);
curl_setopt($ch, CURLOPT_TIMEOUT,        5);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);

$response = curl_exec($ch);

Note: $this->_helpers->url->buildQuery() is identical to http_build_query except that it will drop empty parameters.


Use the Google Ajax API.

http://code.google.com/apis/ajaxsearch/

See this thread for how to get the number of results. While it refers to c# libraries, it might give you some pointers.


Before scrapping data please read https://support.google.com/websearch/answer/86640?rd=1

Against google terms

Automated traffic includes:

Sending searches from a robot, computer program, automated service, or search scraper Using software that sends searches to Google to see how a website or webpage ranks on Google


CURLOPT_CUSTOMREQUEST => ($post)? "POST" : "GET"

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜