SEO - How to keep search engine crawlers from reading entire URLs

I have about 7 query-string parameters in my URL:

http://www.examplesitname.com/EN/en/tshirt-jeans.aspx?productid=324175730&documentid=295110&producttitle=Pyjama+Tshirt&categoryid=55479572&source=TreeStructureNavigation&numberpage=1&pos=TG_n_n

Broken down, the query-string parameters are:

productid

documentid

producttitle

categoryid

source

numberpage

pos

Of these, I only need to expose productid and documentid to the search engine. What is the best approach to achieve this?

I could add one more query-string parameter named "extendedattributes" that contains a comma-separated list of the remaining parameters, which I could split apart again when handling the request and build the response accordingly. Is that a good way to achieve this? Is there a better way?

Thanks


Google Webmaster Tools lets you designate which URL parameters Google should ignore, or not ignore, when it indexes your site. (Look under "Site Configuration" and then "Settings.") This doesn't help with other crawlers, of course, so it is only a partial solution.


The first thing that comes to mind: move the rest of the parameters after a # (the URL fragment), as in the URL below, and then use JavaScript/Ajax to read those parameters and load the content accordingly. However, this method may require design changes, because anything after the # never reaches the web server.

http://www.examplesitname.com/EN/en/tshirt-jeans.aspx?productid=324175730&documentid=295110#producttitle=Pyjama+Tshirt&categoryid=55479572&source=TreeStructureNavigation&numberpage=1&pos=TG_n_n
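
To make the split concrete, here is a minimal sketch of how that URL divides into a server-visible part and a browser-only part. It uses Python's standard urllib.parse purely for illustration; the function name split_params is hypothetical, and on a real page the fragment would be read by client-side JavaScript rather than by Python.

```python
from urllib.parse import urlsplit, parse_qs


def split_params(url):
    """Return (server_params, client_params) for a URL.

    server_params come from the query string, which is sent to the web server.
    client_params come from the fragment (after '#'), which the browser keeps
    to itself. Both are parsed as key=value pairs; that the fragment uses this
    syntax is an assumption of this approach, not a rule of URLs.
    """
    parts = urlsplit(url)
    return parse_qs(parts.query), parse_qs(parts.fragment)


url = (
    "http://www.examplesitname.com/EN/en/tshirt-jeans.aspx"
    "?productid=324175730&documentid=295110"
    "#producttitle=Pyjama+Tshirt&categoryid=55479572"
    "&source=TreeStructureNavigation&numberpage=1&pos=TG_n_n"
)

server_params, client_params = split_params(url)
print(sorted(server_params))  # only productid and documentid reach the server
print(sorted(client_params))  # the other five stay in the browser
```

Only the two parameters in server_params are part of the request, which is why the page would need script to act on the remaining five.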


Use robots.txt or other techniques to exclude all the alternative URLs, and list only the URLs you need in a sitemap. Search engines should then index only the ones you want.
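
One way to produce the URLs for such a sitemap is to strip every parameter except the ones you want indexed. The sketch below is only an illustration under stated assumptions: canonical_url and KEEP are hypothetical names, and it assumes that productid and documentid alone are enough for the server to render the page.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these two parameters are sufficient to identify the page.
KEEP = ("productid", "documentid")


def canonical_url(url, keep=KEEP):
    """Return url with only the query parameters named in keep, and no fragment."""
    parts = urlsplit(url)
    # Keep the original order of the surviving parameters.
    pairs = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() in keep]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(pairs), ""))


full = (
    "http://www.examplesitname.com/EN/en/tshirt-jeans.aspx"
    "?productid=324175730&documentid=295110&producttitle=Pyjama+Tshirt"
    "&categoryid=55479572&source=TreeStructureNavigation&numberpage=1&pos=TG_n_n"
)

print(canonical_url(full))
# http://www.examplesitname.com/EN/en/tshirt-jeans.aspx?productid=324175730&documentid=295110
```

The same reduced URL could also be what the page itself advertises as its preferred address, so that every parameter variant points back to one entry.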
