Preventing an HTML page from being crawled and cached by bots/crawlers

Is there any way to prevent web crawlers from crawling and caching my public-facing web application?


You can use a robots.txt file:

User-agent: *
Disallow: /

But it's not 100% reliable: compliance is voluntary, and not all crawlers will respect it.

From what I have learned recently, the only 100% reliable way is to make all your pages secure.
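If you want to sanity-check the rules before deploying them, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the bot name `ExampleBot` and the `example.com` URL are made up for illustration. Note that this only shows how a compliant crawler would read the file; it does nothing about crawlers that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above: every user agent, whole site disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url.

    is_allowed is a hypothetical helper, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and example.com are placeholders.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/page.html"))
```

With `Disallow: /` the check should come back negative for any path on the site.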


Robots.txt (as already suggested) prevents crawling. If you just want to prevent caching, add the following HTML to your <head> section:

<META NAME="ROBOTS" CONTENT="NOARCHIVE" />
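To confirm the tag actually ends up in the pages you serve, something like the following sketch could be used. It relies only on Python's standard `html.parser`; the class name `RobotsMetaParser` and the function `robots_directives` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the lowercased robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOARCHIVE" /></head></html>'
    print("noarchive" in robots_directives(page))
```

As with robots.txt, the tag is only a request: it works for crawlers that choose to honor it.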


Yes, create a robots.txt file in the root of your web site. There are lots of other interesting tutorials around.


Well, a common way to stop search engines like Google is to include a robots.txt file in the root of your website.

Here is a good article on the subject: http://www.javascriptkit.com/howto/robots.shtml


Stop crawlers
