Yahoo Web Scrapes: What are the limits?

We are using a web scraper whose sleep function uses a random delay (so that the time between scrapes is never the same), but we are still getting blocked by Yahoo after 20-30 requests.

Does anyone know if there is a limit (e.g. 20 requests per minute, or 200 per hour)? Right now the average gap between requests is around 3-6 seconds. Thanks for any help.


One request every 3-6 seconds is quite a low rate, so perhaps there is another problem with your crawler.

A few ideas:

  • set the User-Agent to something non-suspicious
  • set the Referer header to the same domain
  • try running your crawler from a different IP in case your current IP is blacklisted
  • try maintaining cookies

This will all be easier if you use a higher-level library like Mechanize.
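For what it's worth, here is a rough sketch of the header, cookie, and delay ideas above using only the Python standard library rather than Mechanize. The User-Agent string, the Referer URL, and the function names are placeholders I made up for illustration, and the 3-6 second range is just the one from the question; none of this is guaranteed to get past Yahoo's blocking.

```python
import random
import time
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical values for illustration only; substitute your own.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleCrawler/1.0"
REFERER = "https://search.yahoo.com/"


def make_opener(user_agent=USER_AGENT, referer=REFERER):
    """Return (opener, jar): the opener sends the headers and keeps cookies."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)  # cookies persist across requests
    )
    # Replaces the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent), ("Referer", referer)]
    return opener, jar


def random_delay(min_s=3.0, max_s=6.0):
    """Pick a delay in seconds, uniformly between min_s and max_s."""
    return random.uniform(min_s, max_s)


def fetch_all(urls, opener, min_s=3.0, max_s=6.0):
    """Fetch each URL in turn, sleeping a random interval between requests."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(random_delay(min_s, max_s))
        with opener.open(url, timeout=30) as resp:
            pages.append(resp.read())
    return pages
```

You would call `make_opener()` once and reuse the same opener for every request, so the cookies from earlier responses are sent back on later ones.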


So the answer is 5000 queries. Taken from

http://forums.digitalpoint.com/showthread.php?t=736784

http://developer.yahoo.com/search/rate.html
