HTML: can I protect some pages from Google using user authentication?
A section of my website is accessible only to authenticated users. I was wondering whether these pages are crawled by Google, or whether they are kind of "hidden" to the search engine.
thanks
If they are closed to users who are not authenticated, they are of course also closed to Google. The Google bot is nothing but another client trying to access your site.
Some sites, such as newspapers, have content that is reserved for paying users yet is still visible in search engines. That is always a conscious act on the part of the webmaster: the site is deliberately opened up to search engine bots even though they are not paying customers.
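For illustration only, here is a minimal sketch of how such a site might deliberately let crawlers see content that normal visitors have to pay for. It assumes a Flask app and a naive User-Agent check; the bot signatures, routes and session flag are made up, and real publishers verify crawlers far more robustly (e.g. via reverse DNS) rather than trusting the header.

# Hypothetical sketch: serve the full article to recognised search bots,
# but only a teaser to visitors who are not paying subscribers.
from flask import Flask, request, session

app = Flask(__name__)
app.secret_key = "change-me"  # required for session support

BOT_SIGNATURES = ("Googlebot", "Bingbot")  # assumed signatures, not exhaustive

def looks_like_search_bot(user_agent: str) -> bool:
    # Naive check: a real deployment would confirm the crawler's identity
    # instead of trusting whatever the client sends in the header.
    return any(sig in user_agent for sig in BOT_SIGNATURES)

@app.route("/article/<int:article_id>")
def article(article_id: int):
    user_agent = request.headers.get("User-Agent", "")
    if looks_like_search_bot(user_agent) or session.get("is_subscriber"):
        return f"Full text of article {article_id}"
    return f"Teaser for article {article_id} - subscribe to read more"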
Search engines have no "special key" to get into the house.
If you are still in doubt, you can query Google with "site:yoursite.com" and check the result pages.
If your site has links to the pages which require authentication, then yes, Google will attempt to crawl them. It is up to you to ensure that unauthenticated visitors are never served the protected content, for example with a server-side check like the sketch below.
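A minimal sketch of such a check, assuming a Flask application; the route names and session handling are invented for illustration, not taken from the question.

from flask import Flask, redirect, request, session, url_for

app = Flask(__name__)
app.secret_key = "change-me"  # required for session support

@app.route("/members/reports")
def reports():
    # Anyone without a logged-in session is redirected -- including Googlebot,
    # which never authenticates and therefore only ever sees the login page.
    if not session.get("user_id"):
        return redirect(url_for("login", next=request.path))
    return "Protected report contents"

@app.route("/login")
def login():
    return "Login form goes here"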
As Greenie suggests, use a robots.txt file to tell search engines not to attempt to crawl your protected content.
Remember that obeying the instructions in robots.txt is voluntary. There is nothing to stop a web crawler from actually requesting such content, and if so, a robots.txt file could be equivalent to a message on the front door saying "Valuable stuff here!!".
As a web crawler is just another client trying to access your site, the authenticated area will be inaccessible to the crawler too.
If you want to tell web crawlers not to index other parts of your website, use a file called robots.txt that you place in the root directory of your site. For example:
robots.txt
User-agent: *
Disallow: /hidden
This tells all well-behaved web crawlers not to crawl (and hence not index) content under the '/hidden' directory.