Detecting a "unique" anonymous user
It is impossible to identify a user or request as unique
since duping is trivial.
However, there are a handful of methods that, combined, can hamper cheating attempts and give a user quasi-unique status.
I know of the following:
- IP Address - store the IP address of each vis开发者_StackOverflowitor in a database of some sort
- Can be faked
- Multiple computers/users can have the same address
- Users with dynamic IP addresses (some ISP issue them)
- Cookie tracking - store a cookie per visitor. Visitors that don't have it are considered "unique"
- Can be faked
- Cookies can be blocked or cleared via browser
Are there more ways to track non-authorized (non-login, non-authentication) website visitors?
There are actually many ways you can detect a "unique" user. Many of these methods are used by our marketing friends. It get's even easier when you have plugins enabled such as Java, Flash etc.
Currently my favorite presentation of cookie based tracking is evercookie (http://samy.pl/evercookie/). It creates a "permanent" cookie via multiple storage mechanisms, the average user is not able to flush, specifically it uses:
- Standard HTTP Cookies
- Local Shared Objects (Flash Cookies)
- Silverlight Isolated Storage
- Storing cookies in RGB values of auto-generated, force-cached PNGs using HTML5 Canvas tag to read pixels (cookies) back out
- Storing cookies in Web History
- Storing cookies in HTTP ETags
- Storing cookies in Web cache
- window.name caching
- Internet Explorer userData storage
- HTML5 Session Storage
- HTML5 Local Storage
- HTML5 Global Storage
- HTML5 Database Storage via SQLite
I can't remember the URL, but there is also a site which tells you how "anonymous" you are based on everything it can gather from your web browser: What plugins you have loaded, what version, what language, screensize, ... Then you can leverage the plugins I was talking about earlier (Flash, Java, ...) to find out even more about the user. I'll edit this post when I find the page whcih showed you "how unique you are" or maybe somebody knows »» actually it looks as if every user is in a way unique!
--EDIT--
Found the page I was talking about: Panopticlick - "How Unique and trackable is your browser".
It collects stuff like User Agent, HTTP_ACCEPT headers, Browser Plugins, Time Zone, Screen Size and Depth, System Fonts (via Java?), Cookies...
My result: Your browser fingerprint appears to be unique among the 1,221,154 tested so far.
Panopticlick has a quite refined method for checking for unique users using fingerprinting. Apart from IP-adress and user-agent it used things like timezone, screen resolution, fonts installed on the system and plugins installed in the browser etc, so it comes up with a very distinct ID for each and every user without storing anything in their computers. False negatives (finding two different users with the exact same fingerprint) are very rare.
A problem with that approach is that it can yield some false positive, i.e. it considers the same user to be a new one if they've installed a new font for example. If this is ok or not depends on your application I suppose.
Yes, it's impossible to tell anonymous visitors apart with 100% certainty. The best that you can do is to gather the information that you have, and try to tell as many visitors apart as you can.
There is one more piece of infomration that you can use:
- Browser string
- It's not unique, but in combination with the other information it increases the resolution.
If you need to tell the visitors apart with 100% certainty, then you need to make them log in.
There is no sure-fire way to achieve this, in my view. Of your options, cookies are the most likely to yield a reasonably realistic number. NATing and proxy servers can mask the IP addresses of a large number of users, and dynamic IP address allocation will confuse the results for a lot of others
Have you considered using e.g Google Analytics or similar? They do unique visitor tracking as part of their service, and they probably have a lot more money to throw at finding heuristic solutions to this problem than you or I. Just a thought!
精彩评论