Make a link completely invisible?
I'm pretty sure that many people have thought of this, but for some reason I can't find it using Google and StackOverflow search.
I would like to make an invisible link (blacklisted by robots.txt) to a CGI or PHP page that will "trap" malicious bots and spiders. So far, I've tried:
Empty links in the body:
<a href='/trap'><!-- nothing --></a>
This works quite nicely most of the time, with two minor problems:
Problem: The link is part of the body of the document. Even though it is pretty much unclickable with a mouse, some visitors still inadvertently hit it while keyboard-navigating the site with Tab and Enter. Also, if they copy-paste the page into a word processor or e-mail software, for example, the trap link is copied along and is sometimes even clickable (some software doesn't like empty <a> tags and copies the href as the contents of the tag).
Invisible blocks in the body:
<div style="display:none"><a href='/trap'><!-- nothing --></a></div>
This fixes the problem with keyboard navigation, at least in the browsers I tested. The link is effectively inaccessible from the normal display of the page, while still fully visible to most spider bots with their current level of intelligence.
Problem: The link is still part of the DOM. If the user copy-pastes the contents of the page, it reappears.
Inside comment blocks:
<!-- <a href='/trap'>trap</a> -->
This effectively removes the link from the DOM of the page. Well, technically, the comment is still part of the DOM, but it achieves the desired effect that compliant user-agents won't generate the A element, so it is not an actual link.
Problem: Most spider bots nowadays are smart enough to parse (X)HTML and ignore comments. I've personally seen bots that use Internet Explorer COM/ActiveX objects to parse the (X)HTML and extract all links through XPath or Javascript. These types of bots are not tricked into following the trap hyperlink.
I was using method #3 until last night, when I was hit by a swarm of bots that seem to be really selective about which links they follow. Now I'm back to method #2, but I'm still looking for a more effective way.
Any suggestions, or a different solution that I missed?
Add it like you said:
<a id="trap" href='/trap'><!-- nothing --></a>
And then remove it with javascript/jQuery:
$('#trap').remove();
Spam bots won't execute the JavaScript, so they will still see the element, while almost any browser will remove it, making it impossible to reach by tabbing.
Edit: The easiest non-jQuery way would be:
<div id="trapParent"><a id="trap" href='/trap'><!-- nothing --></a></div>
And then remove it with javascript:
var parent = document.getElementById('trapParent');
var child = document.getElementById('trap');
parent.removeChild(child);
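A variant of the snippet above, as a minimal sketch: it looks up the link's parent through `parentNode`, so the extra `trapParent` wrapper is not needed, and it waits for `DOMContentLoaded` so the link exists before the script looks for it. The helper name `removeTrapLink` is hypothetical, and the `trap` id is taken from the markup above.

```javascript
// Hypothetical helper: removes the element with the given id from the
// document it lives in. Returns true if something was removed.
function removeTrapLink(doc, id) {
  var link = doc.getElementById(id);
  if (link && link.parentNode) {
    // parentNode gives us the container, so no wrapper id is required.
    link.parentNode.removeChild(link);
    return true;
  }
  return false;
}

// Only wire this up when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    removeTrapLink(document, 'trap');
  });
}
```

Bots that read the raw HTML without running scripts still see the `<a>` element, which is the point of the approach.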
This solution seems to work well for me; luckily I had bookmarked it. I hope it helps you as well.
You can create a hidden link like this and put it at the very top left of your page. To prevent regular users from accessing it too easily, you can use CSS to lay a logo image over this image.
<a href="/bottrap.php"><img src="images/pixel.gif" border="0" alt=" " width="1" height="1"></a>
If you are interested in how to blacklist the bots, refer to this link for a detailed explanation:
http://www.webmasterworld.com/apache/3202976.htm
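For illustration only, here is a rough sketch of the blacklisting logic itself. It is not the method from the linked page, and it is written in JavaScript rather than the PHP or CGI mentioned in the question. The names `trappedIps` and `handleRequest` are hypothetical, the `/trap` path comes from the question, and the in-memory set is an assumption; a real setup would persist the list.

```javascript
// Hypothetical in-memory blacklist of client addresses that hit the trap.
var trappedIps = new Set();

// Returns the HTTP status code to send for a request.
// path: the requested path; ip: the client address.
function handleRequest(path, ip) {
  if (trappedIps.has(ip)) {
    return 403; // already caught earlier: refuse everything
  }
  if (path === '/trap') {
    trappedIps.add(ip); // first hit on the trap: remember the client
    return 403;
  }
  return 200; // normal visitor
}
```

Since well-behaved crawlers honor the robots.txt exclusion, only clients that ignore it should ever end up in the set.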