
How do I keep my app from tracking bot requests as views?

This is a general question about writing web apps.

I have an application that counts page views of articles, as well as a URL shortener script that I've installed for a client of mine. The problem is that whenever bots hit the site, they inflate the page views.

Does anyone have an idea on how to go about eliminating bot views from the view count of these applications?


There are a few ways you could determine whether your articles are being viewed by an actual user or by a search engine bot. Probably the best way is to check the User-Agent header sent by the browser (or bot). The User-Agent header is a request field that identifies the client application used to access the resource. For example, Internet Explorer might send something like Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US). Google's bot might send something like Googlebot/2.1 (+http://www.google.com/bot.html). It is possible to send a fake User-Agent header, but I can't see the average site user or a major company like Google doing that. If the header is blank, or it matches a common User-Agent string associated with a commercial bot, the request most likely came from a bot.
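As a rough illustration of that check, here is a minimal Python sketch. The function name `is_probable_bot` and the token list `KNOWN_BOT_TOKENS` are made up for this example, and the list is far from complete, so treat it as a starting point rather than a reliable detector.

```python
# Hypothetical, incomplete list of substrings that show up in crawler
# User-Agent strings. A real deployment would use a maintained list.
KNOWN_BOT_TOKENS = ("googlebot", "bingbot", "slurp", "crawler", "spider", "bot")


def is_probable_bot(user_agent):
    """Return True if the User-Agent is missing, blank, or looks like a crawler."""
    # A missing or blank header is treated as a bot, as suggested above.
    if user_agent is None or not user_agent.strip():
        return True
    ua = user_agent.lower()
    # Case-insensitive substring match against the known tokens.
    return any(token in ua for token in KNOWN_BOT_TOKENS)
```

You would call this with whatever your framework exposes as the request's User-Agent header and skip the view increment when it returns True. Because the header can be faked, this only filters out clients that identify themselves honestly.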

While you're at it, you may want to make sure you have an up-to-date robots.txt file. It's a simple text file that provides rules automated bots should respect in terms of which content they are not allowed to retrieve for indexing.
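To see how a well-behaved bot would interpret such rules, here is a small sketch using Python's standard `urllib.robotparser` module. The `/go/` path is an assumption standing in for wherever your short-link redirects live, and `allowed_by_robots` is just a helper name for this example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content. The /go/ prefix is a hypothetical location
# for the URL shortener's redirect endpoints.
ROBOTS_TXT = """\
User-agent: *
Disallow: /go/
"""


def allowed_by_robots(robots_txt, user_agent, path):
    """Return True if the given rules let user_agent fetch path."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

With these rules, a compliant crawler would stay away from anything under `/go/` but could still fetch the articles. Keep in mind that robots.txt is advisory, so bots that ignore it will still need to be filtered some other way.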

Here are a few resources that may be helpful:

  • List of User-Agents
  • How to Verify Googlebot
  • Web Robots Page
  • How do I stop bots from incrementing my file download counter in PHP?


Check User-Agent. Use this header value to distinguish bots from regular browsers/users.

For example,

Google bot:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Safari:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; lv-lv) AppleWebKit/531.22.7 (KHTML, like Gecko) Version/4.0.5 Safari/531.22.7
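Putting this together with a view counter might look something like the sketch below. The in-memory `view_counts`, the `record_view` function, and the regular expression are all assumptions for illustration; a real app would increment a database column instead and would likely need a longer pattern.

```python
import re
from collections import Counter

# Hypothetical pattern covering a few common crawler markers.
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)

# In-memory stand-in for wherever the app actually stores view counts.
view_counts = Counter()


def record_view(article_id, user_agent):
    """Count a view for article_id unless the request looks automated.

    Returns True if the view was counted, False if it was skipped.
    """
    # Skip requests with no User-Agent or one that matches the bot pattern.
    if not user_agent or BOT_PATTERN.search(user_agent):
        return False
    view_counts[article_id] += 1
    return True
```

Called with the two example strings above, the Googlebot request would be skipped and the Safari request would be counted.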