Preventing an HTML page from being crawled and cached by bots/crawlers
Is there any way to prevent web crawlers from crawling and caching my public-facing web application?
You can use a robots.txt file:
User-agent: *
Disallow: /
But it's not 100% reliable: robots.txt is only a request, and not all crawlers will respect it.
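To see how a well-behaved crawler would interpret the rules above, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed, the bot name "ExampleBot", and the example.com URL are made up for illustration; a crawler that ignores robots.txt simply never runs a check like this.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the answer above: block every crawler from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may fetch `url` under these rules.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are placeholders, not real values.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/any/page.html"))
    # prints False
```

This only checks what a polite crawler would conclude; it does not enforce anything on the server side.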
From what I have learned recently, the only 100% reliable way is to make all your pages secure.
Robots.txt (as already suggested) prevents crawling. If you just want to prevent caching, add the following HTML to your <head> section:
<META NAME="ROBOTS" CONTENT="NOARCHIVE" />
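If editing the HTML is not convenient, the same NOARCHIVE directive can, as far as I know, also be sent as an X-Robots-Tag HTTP response header, which the major search engines document as equivalent to the meta tag. The following is only a rough sketch using Python's built-in wsgiref; the app name noarchive_app, the page body, and the port are assumptions for illustration, not part of the original answer.

```python
from wsgiref.simple_server import make_server


def noarchive_app(environ, start_response):
    """Tiny WSGI app (hypothetical) that asks crawlers not to cache the page."""
    body = b"<html><head><title>Demo</title></head><body>Hello</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Header equivalent of <META NAME="ROBOTS" CONTENT="NOARCHIVE" />
        ("X-Robots-Tag", "noarchive"),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for local testing.
    with make_server("localhost", 8000, noarchive_app) as server:
        server.serve_forever()
```

As with the meta tag, this is a request to crawlers, so it only helps with those that choose to honor it.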
Yes, create a robots.txt file in the root of your web site. There are lots of other interesting tutorials around.
Well, a common way to stop search engines such as Google is to include a robots.txt file in the root of your website.
Here is a good article on the subject: http://www.javascriptkit.com/howto/robots.shtml
Stop crawlers