How do you create a robots.txt file that blocks all but the root
How do you create a valid robots.txt file that blocks all crawler requests except for the root, i.e. the landing page at http://www.mysite.com?
Assuming your default page for the root is named index.htm, I believe this will accomplish what you're looking for:
User-agent: *
Allow: /index.htm
Disallow: /
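
If you want to sanity-check those rules locally before deploying them, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The `check` helper and the `ExampleBot` user-agent name are made up for illustration, and this parser uses simple prefix matching, so real crawlers may interpret the rules somewhat differently.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above, fed to the parser as plain text.
ROBOTS_TXT = """\
User-agent: *
Allow: /index.htm
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def check(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` on the site (hypothetical helper)."""
    return parser.can_fetch(agent, "http://www.mysite.com" + path)


if __name__ == "__main__":
    for path in ("/index.htm", "/about.htm", "/"):
        print(path, "allowed" if check(path) else "blocked")
```

One thing this makes visible: the bare `/` URL does not start with `/index.htm`, so this parser reports it as blocked even though `/index.htm` is allowed. For crawlers that support the `$` end-of-URL anchor (Google documents that it does), adding `Allow: /$` may be a way to cover the bare root as well, but test it against the crawlers you care about.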
Google's Webmaster Tools has some great help for formulating a robots.txt file,
and if you use the tools, you also get a robots.txt builder/tester.