Prevent Googlebot from running a function
We have implemented a new "Number of Visits" feature on our site that saves a row in our Views database each time a company profile is accessed. This is done with a server-side "/addVisit" function that runs every time a profile page is loaded. Unfortunately, this means we had 400+ visits from Googlebot last night.
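For context, a simplified sketch of this kind of setup, assuming an Express-style Node server purely as an illustration (the question doesn't specify the stack; addVisit and renderProfile stand in for the real database write and template code):

```javascript
const express = require('express');
const app = express();

// Hypothetical profile route: addVisit and renderProfile are placeholders,
// not actual functions from the post.
app.get('/company/:id', (req, res) => {
  addVisit(req.params.id); // runs server-side on every page load, so crawler hits get counted too
  res.send(renderProfile(req.params.id));
});
```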
Since we do want Google to index these pages, we can't exclude Googlebot on these pages using robots.txt.
I have also read that running this function using a jQuery $.get() will not stop Googlebot.
Is the only working solution to exclude known bot IPs, or are there other options?
Or would using a jQuery $.get('/addVisit') call, combined with a robots.txt rule that disallows /addVisit, stop Googlebot and other bots from triggering this function?
Create a robots.txt file in the root directory of your website, and add:
User-agent: Googlebot
Disallow: /addVisit
You can also use * instead of Googlebot, so that no compliant crawler requests /addVisit. Search engines always start by looking for /robots.txt; if the file exists, they parse its contents and respect the restrictions it declares.
For more information, see http://www.robotstxt.org/robotstxt.html.
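Note that robots.txt only keeps compliant crawlers from requesting the disallowed URL itself; it cannot affect code that already runs inside the page's own server-side handler. So this approach assumes the counter is moved to a separate client-side request, roughly along these lines (a minimal sketch; the element and data attribute used to pass the company id are just illustrations):

```javascript
// Trigger the counter from the page via a separate request to /addVisit.
// Googlebot still crawls and indexes the profile page, but respects the
// "Disallow: /addVisit" rule and never requests the counting URL.
$(function () {
  // Placeholder: however your page identifies the profile being viewed.
  var companyId = $('#profile').data('company-id');
  $.get('/addVisit', { companyId: companyId });
});
```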
If you're handling the count in a server-side HTTP request handler, you could filter out any user agents that contain the word 'Googlebot'. A quick Google search turns up a couple of example Googlebot user agent strings:
Googlebot/2.1 (+http://www.google.com/bot.html)
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
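For example, a sketch of that check in an Express-style /addVisit handler, assuming that stack (adjust to whatever your server uses; recordVisit is a placeholder for the actual Views-table insert, and keep in mind user agent strings can be spoofed):

```javascript
const express = require('express');
const app = express();

app.get('/addVisit', (req, res) => {
  const userAgent = req.get('User-Agent') || '';

  // Skip counting for Googlebot and other common crawlers.
  if (/googlebot|bingbot|yandexbot|baiduspider/i.test(userAgent)) {
    return res.status(204).end(); // respond, but don't record a visit
  }

  recordVisit(req.query.companyId); // placeholder for the real database write
  res.status(204).end();
});
```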