
How to write a PHP script to find the number of indexed pages in Google?

I need to find the number of pages Google has indexed for a specific domain name. How can I do that with a PHP script?

So,

    foreach ($allresponseresults as $responseresult)
    {
        $result[] = array(
            'url' => $responseresult['url'],
            'title' => $responseresult['title'],
            'abstract' => $responseresult['content'],
        );
    }

What do I add to get the estimated number of results? I know the field is called estimatedResultCount, but how do I add it? For example, I read the title with $result['title'], so how do I get the number and print it?

Thank you :)


I think it would be nicer to Google to use their RESTful Search API. See this URL for an example call:

http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:stackoverflow.com&filter=0

(You're interested in the estimatedResultCount value)

In PHP you can use file_get_contents to get the data and json_decode to parse it.

You can find documentation here:

http://code.google.com/apis/ajaxsearch/documentation/#fonje


Example

Warning: The following code does not have any kind of error checking on the response!

function getGoogleCount($domain) {
    $content = file_get_contents('http://ajax.googleapis.com/ajax/services/' .
        'search/web?v=1.0&filter=0&q=site:' . urlencode($domain));
    $data = json_decode($content);
    return intval($data->responseData->cursor->estimatedResultCount);
}

echo getGoogleCount('stackoverflow.com');
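As a follow-up to the warning above, here is a minimal sketch of the parsing step with some error checking. It also returns the result list next to the count, which is what the question's loop builds. The response shape (responseData, cursor, estimatedResultCount, results) is taken from this thread; the helper name parseSearchResponse and the sample payload are made up for illustration, and the fetch itself is left out so the snippet runs without network access.

```php
<?php
// Hypothetical helper: turn the raw JSON text into a count plus result rows.
// Returns null when the JSON is malformed or lacks the expected fields.
function parseSearchResponse($json) {
    $data = json_decode($json, true); // true => associative arrays
    if (!is_array($data)
        || !isset($data['responseData']['cursor']['estimatedResultCount'])) {
        return null;
    }

    $responseData = $data['responseData'];
    $items = (isset($responseData['results']) && is_array($responseData['results']))
        ? $responseData['results']
        : array();

    $results = array();
    foreach ($items as $item) {
        $results[] = array(
            'url'      => isset($item['url']) ? $item['url'] : '',
            'title'    => isset($item['title']) ? $item['title'] : '',
            'abstract' => isset($item['content']) ? $item['content'] : '',
        );
    }

    return array(
        'count'   => (int) $responseData['cursor']['estimatedResultCount'],
        'results' => $results,
    );
}

// Made-up sample payload, not real API output.
$sample = '{"responseData":{"results":[{"url":"http://example.com/",'
    . '"title":"Example","content":"Sample text"}],'
    . '"cursor":{"estimatedResultCount":"8130"}}}';

$parsed = parseSearchResponse($sample);
if ($parsed !== null) {
    echo $parsed['count'], "\n";               // the estimated count
    echo $parsed['results'][0]['title'], "\n"; // first result's title
}
```

With a live request you would pass the string returned by file_get_contents to this function, after checking that it is not FALSE.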


You'd load http://www.google.com/search?q=domaingoeshere.com with cURL and then parse the response, looking for the <p id="resultStats"> element that holds the result count.

You'd have the resulting html stored in a variable $html and then say something like

$arr = explode('<p id="resultStats">', $html);
$bottom = $arr[1];
$middle = explode('</p>', $bottom);

Please note that this is untested and a very rough example. You'd be better off parsing the html with a dedicated parser or matching the line with regular expressions.
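For the regular-expression route mentioned above, a rough sketch could look like the following. It assumes the count sits in an element with id="resultStats", as described in this answer; the helper name extractResultStats and the sample HTML are hypothetical, and Google's real markup may differ or change.

```php
<?php
// Hypothetical helper: return the text inside the element with
// id="resultStats", or null when no such element is found.
function extractResultStats($html) {
    // Match any tag carrying id="resultStats", whatever its tag name or
    // other attributes, up to the first matching closing tag.
    $pattern = '/<(\w+)[^>]*\bid="resultStats"[^>]*>(.*?)<\/\1>/s';
    if (preg_match($pattern, $html, $m)) {
        // Drop any nested markup and surrounding whitespace.
        return trim(strip_tags($m[2]));
    }
    return null;
}

// Made-up sample markup for illustration.
$sampleHtml = '<div><p id="resultStats">About 8,130 results</p></div>';
echo extractResultStats($sampleHtml), "\n";
```

A regex like this breaks on nested elements of the same tag name, so a real HTML parser is still the safer choice for anything beyond a quick check.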


The estimatedResultCount value from the Google AJAX API isn't accurate. Parsing the HTML results page isn't a good approach either, because Google blocks you after several searches.


Count the number of results for the query site:yourdomainhere.com. For example, stackoverflow.com has about 830k.


// This gives you the count you see on the search results web page.
// The HTML content is fetched with file_get_contents.

header('Content-Type: text/plain');

$url = "https://www.google.com/search?q=" . urlencode("your url"); // replace "your url" with your query
$html = file_get_contents($url);
if (FALSE === $html) {
    throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}

$arr = explode('<div class="sd" id="resultStats">', $html);
$bottom = $arr[1];
$middle = explode('</div>', $bottom);
echo $middle[0];

Output:
About 8,130 results
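The output above is a text string, not a number. If you need an integer, one possible sketch is to pull out the first run of digits and strip the thousands separators. The helper name resultStatsToInt is made up, and the code assumes the count is the first number in the string.

```php
<?php
// Hypothetical helper: convert text like "About 8,130 results" to an int.
// Returns null when the text contains no digits.
function resultStatsToInt($text) {
    // First run of digits, allowing "," or "." as thousands separators.
    if (preg_match('/\d[\d,.]*/', $text, $m)) {
        // Remove everything that is not a digit before casting.
        return (int) preg_replace('/\D/', '', $m[0]);
    }
    return null;
}

echo resultStatsToInt('About 8,130 results'), "\n";
```

The same helper also handles a formatted value such as "111,000,000".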


Case 2: You can also use the Google API, but its count differs from the one shown on the web page:
https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=ursitename&callback=processResults


https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:google.com

"cursor":{"resultCount":"111,000,000","estimatedResultCount":"111000000",
