Block RewriteRule in robots.txt
Here is an example RewriteRule from my .htaccess file:
RewriteRule ^ABC$ index.php?partner_id=123&utm_source=partner&utm_medium=link&utm_campaign=ABC [L]
So http://mywebsite.com/123 would point to index.php?partner_id=123&utm_source=partner&utm_medium=link&utm_campaign=ABC
index.php is a very important page that must be properly indexed by search engines, but I would like to block http://mywebsite.com/123 from being indexed without affecting the indexing of http://mywebsite.com/ or http://mywebsite.com/index.php.
Any help would be great.
If you want to block http://mywebsite.com/123, but allow http://mywebsite.com/123index.php, then you need an Allow and a Disallow:
User-agent: *
Allow: /123index.php
Disallow: /123
This will disallow anything that starts with /123, but specifically allow /123index.php.
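One way to sanity-check these two rules is Python's standard-library urllib.robotparser, which uses plain prefix matching and applies rules in file order (so the Allow line has to come first, as it does above). This is only a minimal sketch: the user-agent name ExampleBot and the helper is_allowed are placeholders I made up, and real crawlers may resolve Allow/Disallow conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above.
ROBOTS_TXT = """\
User-agent: *
Allow: /123index.php
Disallow: /123
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT.

    "ExampleBot" is a hypothetical crawler name used for illustration.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://mywebsite.com" + path)


if __name__ == "__main__":
    for path in ("/123", "/123index.php", "/123/file.html", "/index.php", "/"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Note that urllib.robotparser does not understand the $ extension, so it is only useful for checking prefix rules like these.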
Standard robots.txt syntax doesn't let you disallow specific URLs. Rather, it disallows URLs that start with the pattern that you specify.
Google and Bing (and some others) support some extensions to the standard syntax. Using Google's $ end-of-URL anchor, you could write:
Disallow: /123$
And that would block just that one URL. Other crawlers might or might not support that syntax.
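To make the difference between prefix matching and the $ anchor concrete, here is a rough sketch of how a crawler that supports the Google-style extensions might test a single rule against a URL path. The function rule_matches is a hypothetical helper written for illustration, not code from any actual crawler, and it ignores details such as percent-encoding and rule precedence.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches a URL path.

    Assumed Google-style semantics: '*' matches any run of characters,
    and a trailing '$' anchors the pattern to the end of the URL.
    Without '$', the pattern only has to match the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' for each '*' wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # fullmatch requires the whole path to match; match is a prefix test.
    matcher = re.fullmatch if anchored else re.match
    return matcher(regex, path) is not None


if __name__ == "__main__":
    for pattern in ("/123", "/123$"):
        for path in ("/123", "/123abc", "/123/file.html", "/index.php"):
            verdict = "matches" if rule_matches(pattern, path) else "no match"
            print(f"{pattern!r} vs {path!r}: {verdict}")
```

With the plain /123 rule, /123abc and /123/file.html match as well; with /123$, only /123 itself does.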
Note in response to comment:
If I understand correctly, after your comment, you want to allow http://mywebsite.com/index.php, but block http://mywebsite.com/123. If you know there are no other resources that start with /123, then you can write:
Disallow: /123
That will block anything that starts with /123, for example /123/file.html and /123abc. If there are other resources that start with /123 and you want to allow them, then you'll need:
Disallow: /123$
But understand that only Google, and probably Bing, will respect the $ anchor. Many other crawlers won't.