Preventing an HTML page from being crawled and cached by bots/crawlers

2023-01-25 06:07

Is there any way to prevent web crawlers from crawling and caching my public-facing web application?

You can use robots.txt:

User-agent: *
Disallow: /

But it's not 100% reliable: robots.txt is only a request, and not all crawlers will respect it.
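To illustrate why this only works for well-behaved bots, here is a minimal sketch of the check a polite crawler runs on its own side before fetching a page, using Python's standard urllib.robotparser. The bot name "ExampleBot" and the example.com URLs are placeholders, not anything from the question.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above: ask every crawler to stay away from everything.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the URL are hypothetical, for illustration only.
print(is_allowed("ExampleBot", "https://example.com/index.html"))
```

The check runs inside the crawler, not on your server, so a crawler that never performs it is not blocked by anything.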

From what I have learned recently, the only 100% reliable way is to make all your pages secure, so that they cannot be fetched without logging in.
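As a rough sketch of what "secure" could look like, assuming it means requiring credentials on every request, the function below validates an HTTP Basic Authorization header and picks the response status. The credentials and the function names is_authorized and response_status are made up for this example; a real application would rely on its framework's authentication and serve everything over HTTPS.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "demo"
EXPECTED_PASSWORD = "change-me"


def is_authorized(authorization_header):
    """Return True if the header carries the expected Basic credentials."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(authorization_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    password_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and password_ok


def response_status(authorization_header):
    """200 for authenticated requests, 401 for everyone else (crawlers included)."""
    return 200 if is_authorized(authorization_header) else 401
```

An anonymous crawler sends no credentials, so it only ever receives the 401 response and has no page content to index or cache.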

robots.txt (as already suggested) prevents crawling by crawlers that honor it. If you just want to prevent caching, add the following HTML to your <head> section:

<META NAME="ROBOTS" CONTENT="NOARCHIVE" />
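The meta tag only works for HTML pages. For other responses such as PDFs or images, the major search engines document an equivalent X-Robots-Tag response header. The sketch below shows one way to add it; the helper name with_noarchive is an assumption for this example, and how you attach headers depends on your server or framework.

```python
def with_noarchive(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers that also asks crawlers not to keep a cached copy."""
    result = dict(headers)  # copy, so the caller's dict is left untouched
    result["X-Robots-Tag"] = "noarchive"
    return result


# Example: headers for a PDF response, which cannot carry a <meta> tag.
pdf_headers = with_noarchive({"Content-Type": "application/pdf"})
print(pdf_headers)
```

Like the meta tag, this header is a request to crawlers, so it has the same reliability limits as robots.txt.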

Yes, create a robots.txt file in the root of your website. There are lots of tutorials on it around.

Well, a common way to stop search engines such as Google is to include a robots.txt file in the root of your website.

Here is a good article on the subject: http://www.javascriptkit.com/howto/robots.shtml

Stop crawlers
