robots.txt disallow: spider

I'm looking at a robots.txt file of a site I would like to do a one off scrape and there is this line:

User-agent: spider

Disallow: /

Does this mean they don't want any spiders? I was under the impression that * was used for all spiders. If true, this would of course stop spiders such as Google's.


This just tells agents that identify themselves as "spider" to be polite enough not to browse the site.

It has no special meaning beyond that.

robots.txt files are honored only by robots, so the way to exclude all robots is to use a *:

User-Agent: *
Disallow: /
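To see the difference between the two rules in practice, here is a small sketch using Python's standard-library `urllib.robotparser` (the URL is a placeholder): the site's actual rule blocks only an agent calling itself "spider", while other agents remain allowed.

```python
from urllib.robotparser import RobotFileParser

# The rule from the question: only the agent named "spider" is disallowed.
robots_txt = """User-agent: spider
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A crawler identifying itself as "spider" is blocked...
print(rp.can_fetch("spider", "https://example.com/page"))     # False
# ...but other crawlers, e.g. Googlebot, are unaffected.
print(rp.can_fetch("Googlebot", "https://example.com/page"))  # True
```

With `User-agent: *` instead, `can_fetch` would return False for every agent.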
