Small PHP script for telling real search-engine spiders from spammers

Hi, I'd like your opinions on this code, which I found on a website, for telling real search-engine spiders apart from spammers. Is it any good? Do you have any recommendations for other scripts or methods for this?

<?php
$ua = $_SERVER['HTTP_USER_AGENT'];

$spiders = array('msnbot', 'googlebot', 'yahoo');
$pattern = array("/\.google\.com$/", "/search\.live\.com$/", "/\.yahoo\.com$/");

for ($i = 0; $i < count($spiders) and $i < count($pattern); $i++) {
  if (stristr($ua, $spiders[$i])) {
    // it's pretending to be MSN's bot or Google's bot
    $ip = $_SERVER['REMOTE_ADDR'];
    $hostname = gethostbyaddr($ip);

    if (!preg_match($pattern[$i], $hostname)) {
      // the hostname does not belong to either live.com or googlebot.com.
      // Remember the UA already said it is either MSNBot or Googlebot.
      // So it's a spammer.
      echo "spammer";
      exit;
    } else {
      // Now we have a hit that half-passes the check. One last go:
      $real_ip = gethostbyname($hostname);
      if ($ip != $real_ip) {
        // spammer!
        echo "Please leave Now spammr";
        break;
      } else {
        // real bot
      }
    }
  } else {
    echo "hello user";
  }
}

Note: I tested this code with User Agent Switcher and it worked perfectly, but I'm not sure whether it will work in the real world. What do you think?


What would keep a spammer from simply giving an entirely correct user agent string?

I think this is fairly pointless. You would have to at least compare IP ranges (or their name servers) as well in order to get reliable results. This is possible for Google:

Google Webmaster Central: How to verify Googlebot

But even if you test for Google and Bing this way, a spam bot can still enter your site simply by sending a browser user agent. Therefore, it is ultimately impossible to detect a spam bot. They are a reality, and there is no good way to keep them out of a website.
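For reference, the reverse-then-forward DNS check that Google documents for Googlebot might be sketched roughly like this. It is only a minimal sketch: the function name `verifyCrawler`, the constant `CRAWLER_HOST_SUFFIXES`, and the hostname suffixes listed in it are assumptions for illustration and should be checked against each search engine's current documentation. Unlike the code in the question, it keeps each user-agent token paired with its own hostname suffixes, and it only handles IPv4 because `gethostbynamel()` returns IPv4 addresses.

```php
<?php
// Assumed suffix lists; verify them against each search engine's documentation.
const CRAWLER_HOST_SUFFIXES = [
    'googlebot' => ['.googlebot.com', '.google.com'],
    'bingbot'   => ['.search.msn.com'],
    'msnbot'    => ['.search.msn.com'],
];

/**
 * Returns true or false for a visitor whose user agent claims to be a listed
 * crawler, and null when the user agent makes no such claim.
 * $reverse and $forward are injectable so the logic can be exercised without DNS.
 */
function verifyCrawler(string $ua, string $ip, ?callable $reverse = null, ?callable $forward = null): ?bool
{
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    foreach (CRAWLER_HOST_SUFFIXES as $token => $suffixes) {
        if (stripos($ua, $token) === false) {
            continue; // the user agent does not claim to be this crawler
        }

        // Step 1: reverse lookup. gethostbyaddr() returns the unmodified IP
        // on failure and false on malformed input.
        $host = $reverse($ip);
        if (!is_string($host) || $host === $ip) {
            return false;
        }
        $host = strtolower($host);

        // Step 2: the hostname must end with one of the expected suffixes.
        $suffixOk = false;
        foreach ($suffixes as $suffix) {
            if (substr($host, -strlen($suffix)) === $suffix) {
                $suffixOk = true;
                break;
            }
        }
        if (!$suffixOk) {
            return false;
        }

        // Step 3: forward lookup must lead back to the connecting IP.
        $ips = $forward($host);
        return is_array($ips) && in_array($ip, $ips, true);
    }

    return null;
}
```

In a request handler this could be called as `verifyCrawler($_SERVER['HTTP_USER_AGENT'] ?? '', $_SERVER['REMOTE_ADDR'])`. A `null` result only means the visitor did not claim to be a crawler, so, as noted above, a spam bot sending a browser user agent passes straight through.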


You can also use .htaccess rules to block this kind of traffic, as in this tutorial: http://perishablepress.com/press/2007/06/28/ultimate-htaccess-blacklist/
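If editing .htaccess is not an option, the same user-agent blacklist idea could be approximated in PHP. This is only a rough sketch: `isBlockedUserAgent` and `BLOCKED_UA_FRAGMENTS` are hypothetical names, and the list entries are placeholders, not taken from the linked tutorial.

```php
<?php
// Hypothetical blocklist; the entries are placeholders for illustration only.
const BLOCKED_UA_FRAGMENTS = ['examplescraper', 'badbot'];

/**
 * Returns true when the user agent is empty or contains a blocked fragment
 * (case-insensitive substring match).
 */
function isBlockedUserAgent(string $ua, array $fragments = BLOCKED_UA_FRAGMENTS): bool
{
    // Treating an empty user agent as blocked is a choice made for this sketch.
    if (trim($ua) === '') {
        return true;
    }
    foreach ($fragments as $fragment) {
        if (stripos($ua, $fragment) !== false) {
            return true;
        }
    }
    return false;
}
```

A page could then refuse the request with something like `if (isBlockedUserAgent($_SERVER['HTTP_USER_AGENT'] ?? '')) { http_response_code(403); exit; }`. Like any user-agent check, it only stops bots that announce themselves honestly.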
