
How to detect automated agents that are copying the content of my site?

I notice that some sites are copying the content of one of my client's sites using automated agents. I want to detect their requests and show them a captcha to prevent them from copying the site content.

Is there any way to detect them?


This is a complex problem and a game of cat and mouse. To make it at least slightly more difficult for them:

  1. Ban the IPs that are hitting the site repeatedly; a normal user would not need ALL the pages
  2. Ban public proxies; lists are available by googling
  3. Redirect any request from banned IPs/proxies to a captcha page (see the sketch after this list)
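
A minimal sketch of points 1-3 as a Flask before-request hook. The `BANNED_IPS` and `PUBLIC_PROXIES` sets and the `/captcha` route are hypothetical placeholders; in practice you would populate them from your own ban list and a downloaded proxy list.

```python
# Minimal sketch, assuming a Flask app.
# BANNED_IPS, PUBLIC_PROXIES and /captcha are placeholder names.
from flask import Flask, request, redirect

app = Flask(__name__)

BANNED_IPS = set()        # IPs banned for hammering the site (point 1)
PUBLIC_PROXIES = set()    # known public proxy IPs (point 2)

@app.before_request
def gate_banned_clients():
    ip = request.remote_addr
    # Point 3: send banned IPs/proxies to the captcha page instead of content.
    if ip in BANNED_IPS or ip in PUBLIC_PROXIES:
        if request.path != "/captcha":
            return redirect("/captcha")

@app.route("/captcha")
def captcha():
    # Serve the captcha challenge here; on success, drop the IP from BANNED_IPS.
    return "Solve the captcha to continue."
```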


Typically an "automated agent" accesses far more data in a short period than a typical user would. You would need to set up something that tracks the IP addresses of all visitors, spots any IP that stands out, and blocks it.

Of course, this is made harder by proxies, dynamic IPs, and so on.
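
A rough sketch of the per-IP tracking idea, using a sliding one-minute window; the 120-requests-per-minute threshold is an arbitrary example value, not a recommendation.

```python
# Track request timestamps per IP and flag IPs that stand out.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS_PER_WINDOW = 120   # example threshold, tune for your traffic

hits = defaultdict(deque)       # ip -> timestamps of recent requests

def looks_automated(ip: str) -> bool:
    """Record a request from `ip` and report whether it exceeds the threshold."""
    now = time.time()
    window = hits[ip]
    window.append(now)
    # Drop timestamps that have fallen out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_REQUESTS_PER_WINDOW

# Example: call this on every request and challenge or block flagged IPs.
if looks_automated("203.0.113.7"):
    print("Show a captcha or block this client.")
```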
