robots.txt: ignore all folders but crawl all files in root
Should I then do
User-agent: *
Disallow: /
Is it as simple as that, or will that stop the files in the root from being crawled as well?
Basically, that is what I am after: crawling all the files/pages in the root but none of the folders at all. Or am I going to have to specify each folder explicitly, i.e.
disallow: /admin
disallow: /this
.. etc
thanks
nat
Your example will block all the files in root as well, because "Disallow: /" matches every path that starts with "/", which is every URL on the site.
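If you want to check this locally, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name `allowed`, the example.com URLs, and the file names are made up for illustration. Note that this parser only does plain prefix matching and does not understand wildcard extensions.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The rules from the question: block everything.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# The per-folder alternative from the question.
PER_FOLDER = """\
User-agent: *
Disallow: /admin
Disallow: /this
"""

# Hypothetical URLs, used only to exercise the rules.
print(allowed(BLOCK_EVERYTHING, "https://example.com/index.html"))  # False
print(allowed(PER_FOLDER, "https://example.com/index.html"))        # True
print(allowed(PER_FOLDER, "https://example.com/admin/users.html"))  # False
```

So "Disallow: /" blocks the root files too, while listing each folder leaves them crawlable.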
There isn't a "standard" way to easily do what you want without specifying each folder explicitly.
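Some crawlers, however, do support extensions that allow pattern matching. You could disallow all bots that don't support pattern matching, but allow those that do. A crawler obeys only the most specific User-agent group that matches it, so Googlebot follows its own group and ignores the "*" group.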
For example
# disallow all robots
User-agent: *
Disallow: /
# let Googlebot read HTML and PDF files
User-agent: Googlebot
Allow: /*.html
Allow: /*.pdf
Disallow: /
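To see how a pattern-matching crawler would read the Googlebot group above, here is a rough sketch of the wildcard and "longest matching rule wins" behaviour that Google documents for its crawler. It is a simplified approximation written for illustration, not Googlebot's actual code. The function names and test paths are invented, and the "Disallow: /*/" rule set is a hypothetical variation that is not part of the answer above.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """rules: list of (allow, pattern). Longest matching pattern wins; Allow wins ties."""
    best = None  # (pattern length, allow flag) of the most specific match so far
    for allow, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), allow)
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule is allowed by default.
    return True if best is None else best[1]


# The Googlebot group from the example above.
ANSWER_RULES = [(True, "/*.html"), (True, "/*.pdf"), (False, "/")]

# Hypothetical alternative: block any path that contains a second slash.
ROOT_ONLY_RULES = [(False, "/*/")]

for path in ["/index.html", "/admin/", "/admin/page.html"]:
    print(path, is_allowed(ANSWER_RULES, path), is_allowed(ROOT_ONLY_RULES, path))
```

Under these assumptions, "Allow: /*.html" also matches HTML files inside folders, such as /admin/page.html. If only root-level files should be crawled, a rule like "Disallow: /*/" may be closer to the goal for crawlers that support wildcards, but test it with the crawler's own tools before relying on it.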