
robots.txt: ignore all folders but crawl all files in root

Should I then do:


User-agent: *

Disallow: /

Is it as simple as that, or will that stop the files in the root from being crawled as well?

Basically that is what I am after: crawling all the files/pages in the root, but not any of the folders at all. Or am I going to have to specify each folder explicitly, i.e.

Disallow: /admin

Disallow: /this

.. etc

thanks

nat


Your example will block all the files in the root as well: Disallow: / blocks every path on the site.

There isn't a "standard" way to easily do what you want without specifying each folder explicitly.
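For illustration, the explicit approach would look like the following (the folder names here are just placeholders for whatever actually exists on your site). Note that the trailing slash matters: Disallow: /admin/ blocks only that folder, while Disallow: /admin is a prefix match that would also block a root file such as /admin.html.

User-agent: *
Disallow: /admin/
Disallow: /images/
Disallow: /scripts/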

Some crawlers, however, support extensions that allow pattern matching. You could disallow all bots that don't support pattern matching, and write pattern-based rules for those that do.

For example:

# disallow all robots
User-agent: *
Disallow: /

# let Google read html and pdf files
User-agent: Googlebot
Allow: /*.html
Allow: /*.pdf
Disallow: /
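To see how Google-style wildcard rules behave, here is a minimal Python sketch. It is not Google's actual implementation, just an approximation of the documented precedence rule (the longest matching pattern wins, and Allow wins a length tie); the function names are my own.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt pattern with Google-style wildcards into a
    regex: '*' matches any character sequence, '$' anchors the URL end."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "$":
            parts.append("$")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def is_allowed(path, rules):
    """Decide whether 'path' may be crawled under 'rules', a list of
    (directive, pattern) tuples. Per Google's documented behaviour the
    longest matching pattern wins, and Allow wins a length tie."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# The Googlebot block from the example above:
rules = [("Allow", "/*.html"), ("Allow", "/*.pdf"), ("Disallow", "/")]

print(is_allowed("/page.html", rules))        # True: /*.html beats /
print(is_allowed("/admin/", rules))           # False: only / matches
print(is_allowed("/admin/page.html", rules))  # True: /*.html matches here too
```

Note the last call: because /*.html also matches HTML files inside folders, the Googlebot block above lets Google fetch HTML anywhere, not just in the root. For crawlers that honour wildcards, a rule like Disallow: /*/ (block any path containing a second slash) may come closer to "root files only", though whether that works depends on the specific crawler's wildcard support.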
