PHP script to show Google ranking results

Does anyone know if it is possible to display the Google PageRank of a particular website using a PHP script?

If it is possible, how do I do it?


Okay, I rewrote my answer and extracted only the relevant part of my SEO helper. (My previous version also had other stuff in it, like Alexa rank, Google index, Yahoo links, etc. If you are looking for that, just check an older revision of this answer!)

Please be aware that there are pages that have NO PAGERANK, and by "no" I DON'T MEAN ZERO. There is just none. This may be because the page is so very unimportant (even less important than PR 0), or because it is so new that it has not been ranked yet, even though it might very well be important. My class treats this case the same as PR 0!

This has some pros and some cons. If possible, you should handle it separately in your logic, but this is not always possible, so 0 is the next best approach.
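If your logic can keep the two cases apart, one option is a nullable integer. This is only a minimal sketch: `describePageRank` and `pageRankOrZero` are hypothetical helpers that are not part of the class below, and `null` is an assumed convention for "no PageRank".

```php
<?php
// Hypothetical helper, not part of Helper_Seo: null stands for
// "no PageRank assigned", which is different from a rank of 0.
function describePageRank(?int $rank): string
{
    if ($rank === null) {
        return 'no PageRank';
    }
    return 'PR ' . $rank;
}

// Collapse "none" to 0 only where the calling code cannot handle null.
function pageRankOrZero(?int $rank): int
{
    return $rank ?? 0;
}
```

Code that only needs a number can call `pageRankOrZero()`, while reports can use `describePageRank()` to show the difference.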

Furthermore:

This code is reverse engineered and does not use any sort of API that comes with an SLA or similar guarantee. So it might stop working AT ANY TIME!

And PLEASE DON'T FLOOD GOOGLE!

I tested this. If you only sleep for a very short time between requests, Google blocks you after about 1000 requests (for quite some time!). With a random sleep between 1.5 and 2 seconds it looks fine.
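The random sleep could be sketched roughly like this. The names `randomDelayMicroseconds` and `lookupThrottled` are hypothetical, and `$lookup` is assumed to be any callable that takes a URL and returns a rank. The optional `$sleep` parameter exists only so the delay can be replaced in tests.

```php
<?php
// Hypothetical helper: returns a random delay in microseconds between
// 1.5 and 2 seconds, the range that looked fine in the test above.
function randomDelayMicroseconds(int $minMs = 1500, int $maxMs = 2000): int
{
    return random_int($minMs, $maxMs) * 1000;
}

// Calls $lookup for each URL and sleeps before every request except the first.
// $lookup is an assumption: any callable that takes a URL and returns a rank.
function lookupThrottled(array $urls, callable $lookup, ?callable $sleep = null): array
{
    $sleep = $sleep ?? 'usleep'; // default to the built-in usleep()
    $results = [];
    $first = true;
    foreach ($urls as $url) {
        if (!$first) {
            $sleep(randomDelayMicroseconds());
        }
        $first = false;
        $results[$url] = $lookup($url);
    }
    return $results;
}
```

With the class from this answer, a call might look like `lookupThrottled($urls, ['Helper_Seo', 'getPageRank'])`.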

I once crawled the PageRank for 70k pages. Only once, because I just needed it. I did only 5k a day from several IPs, and now I have the data, and it doesn't get outdated because the pages have been there for decades.

IMO it's totally OK to check a PageRank once in a while, or even several at once, but don't misuse this code or Google may lock us out altogether!

<?php
/*
 * @author Joe Hopfgartner <joe@2x.to>
 */
class Helper_Seo
{

    protected static function _pageRankStrToNum($Str,$Check,$Magic) {
        // 2^32
        $Int32Unit=4294967296;
        $length=strlen($Str);
        for($i=0;$i<$length;$i++) {
            $Check*=$Magic;
            // If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31),
            // the result of converting to integer is undefined
            if($Check>=$Int32Unit) {
                $Check=($Check-$Int32Unit*(int)($Check/$Int32Unit));
                // if the check is less than -2^31
                $Check=($Check<-2147483648)?($Check+$Int32Unit):$Check;
            }
            // $Str[$i] replaces the curly-brace offset syntax, which PHP 8 removed
            $Check+=ord($Str[$i]);
        }
        return $Check;
    }
    /*
    * Generate a hash for a URL
    */
    protected static function _pageRankHashURL($String) {
        $Check1=self::_pageRankStrToNum($String,0x1505,0x21);
        $Check2=self::_pageRankStrToNum($String,0,0x1003F);
        $Check1>>=2;
        $Check1=(($Check1>>4)&0x3FFFFC0)|($Check1&0x3F);
        $Check1=(($Check1>>4)&0x3FFC00)|($Check1&0x3FF);
        $Check1=(($Check1>>4)&0x3C000)|($Check1&0x3FFF);
        $T1=(((($Check1&0x3C0)<<4)|($Check1&0x3C))<<2)|($Check2&0xF0F);
        $T2=(((($Check1&0xFFFFC000)<<4)|($Check1&0x3C00))<<0xA)|($Check2&0xF0F0000);
        return($T1|$T2);
    }
    /*
    * Generate a checksum for the hash string
    */
    protected static function CheckHash($Hashnum) {
        $CheckByte=0;
        $Flag=0;
        $HashStr=sprintf('%u',$Hashnum);
        $length=strlen($HashStr);
        for($i=$length-1;$i>=0;$i--) {
            // cast the digit character to int before doing arithmetic on it
            $Re=(int)$HashStr[$i];
            if(1===($Flag%2)) {
                $Re+=$Re;
                $Re=(int)($Re/10)+($Re%10);
            }
            $CheckByte+=$Re;
            $Flag++;
        }
        $CheckByte%=10;
        if(0!==$CheckByte) {
            $CheckByte=10-$CheckByte;
            if(1===($Flag%2)) {
                if(1===($CheckByte%2)) {
                    $CheckByte+=9;
                }
                $CheckByte>>=1;
            }
        }
        return '7'.$CheckByte.$HashStr;
    }
    public static function getPageRank($url) {
        $fp=fsockopen("toolbarqueries.google.com",80,$errno,$errstr,30);
        if(!$fp) {
            trigger_error("$errstr ($errno)<br />\n");
            return false;
        }
        else {
            $out="GET /search?client=navclient-auto&ch=".self::CheckHash(self::_pageRankHashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0 HTTP/1.1\r\n";
            $out.="Host: toolbarqueries.google.com\r\n";
            $out.="User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 2.0.114-big; Windows XP 5.1)\r\n";
            $out.="Connection: Close\r\n\r\n";
            fwrite($fp,$out);
            #echo " U: http://toolbarqueries.google.com/search?client=navclient-auto&ch=".$this->CheckHash($this->_pageRankHashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0";
            #echo "\n";
            //$pagerank = substr(fgets($fp, 128), 4);
            //echo $pagerank;
            #echo "DATA:\n\n";
            $responseOK = false;
            $response = "";
            $inhead = true;
            $body = "";
            while(!feof($fp)) {

                $data=fgets($fp,128);

                if($data == "\r\n" && $inhead) {
                    $inhead = false;
                } else {
                    if(!$inhead) {
                        $body.= $data;
                    }
                }

                //if($data == '\r\n\r\n')
                $response .= $data;
                if(trim($data) == 'HTTP/1.1 200 OK') {
                    $responseOK = true;
                } 

                #echo "D ".$data;
                $pos=strpos($data,"Rank_");
                if($pos===false) {
                }
                else {
                    $pagerank=trim(substr($data,$pos+9));
                    if($pagerank === '0') {
                            fclose($fp);
                            return 0;
                    } else if(intval($pagerank) === 0) {
                        // close the socket first: nothing after a throw is executed
                        fclose($fp);
                        throw new Exception('could not get pagerank from string: '.$pagerank);
                    } else {
                        fclose($fp);
                        return intval( $pagerank );
                    }
                }
            }
            fclose($fp);


            //var_dump($body);
            if($responseOK && $body=='') {
                return 0;
            }
            throw new Exception('Could not get pagerank, unknown error. Probably a Google flood block. My tests showed that 1 req/sec is okay! I recommend a random sleep between 1.5 and 2 secs. No sleep breaks at ~1000 reqs.');
        }
    }

}
$url = "http://www.2xfun.de/";
$pagerank = Helper_Seo::getPageRank($url);
var_dump($pagerank); 
?>
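Since the lookup can throw an exception or return `false`, callers may want to wrap it. The following is only a sketch: `tryPageRank` is a hypothetical wrapper, and `$lookup` is assumed to be any callable that behaves like `getPageRank()` (returns an int, returns `false`, or throws an `Exception`).

```php
<?php
// Hypothetical wrapper: $lookup is any callable that returns an int rank,
// returns false on a connection error, or throws an Exception.
function tryPageRank(callable $lookup, string $url): array
{
    try {
        $rank = $lookup($url);
    } catch (Exception $e) {
        return ['ok' => false, 'rank' => null, 'error' => $e->getMessage()];
    }
    if ($rank === false) {
        return ['ok' => false, 'rank' => null, 'error' => 'connection failed'];
    }
    return ['ok' => true, 'rank' => $rank, 'error' => null];
}
```

A call such as `tryPageRank(['Helper_Seo', 'getPageRank'], $url)` then lets a crawl loop log the error and stop instead of dying on the first blocked request.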