Preventing Site from Being Indexed by Search Engines
How can I prevent Google and other search engines from indexing my website?
I realize this is a very old question, but I wanted to highlight the comment made by @Julien as an actual answer.
According to Joost de Valk, robots.txt will indeed prevent search engines from crawling your site, but your URLs may still appear in search results if other sites link to them.
The solution is either adding a robots meta tag to the <head> of your pages:
<meta name="robots" content="noindex,nofollow"/>
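For context, here's a minimal sketch of where that tag sits in a page (the title and body content are just placeholders):

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8"/>
    <!-- keeps this page out of the index and stops crawlers following its links -->
    <meta name="robots" content="noindex,nofollow"/>
    <title>Example page</title>
  </head>
  <body>...</body>
</html>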
Or, a simpler option is to add the following to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
Obviously your web host has to allow .htaccess rules and have the mod_headers
module installed for that to work.
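As a sketch, you can wrap the directive in an IfModule check so Apache won't throw an error if mod_headers isn't loaded:

# .htaccess - only send the header when mod_headers is available
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>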
Both of these approaches keep search engines from indexing your pages AND from following the links on them, so your pages stay out of search results. Win-win, baby.
Create a robots.txt
file in your site root with the following content:
# robots.txt for yoursite
User-agent: *
Disallow: /
Well-behaved search engines (and most robots in general) will respect the contents of this file. You can add any number of Disallow: /path
lines listing the paths you want robots to skip. More details at robotstxt.org.
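For instance, if you only wanted to block a few sections rather than the whole site, the file might look like this (the directory names are just examples):

# robots.txt - block only specific directories (example paths)
User-agent: *
Disallow: /private/
Disallow: /tmp/
Disallow: /admin/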