SEO - How to prevent search engine crawlers from reading entire URLs
I have about 7 query-string parameters in my URL:
http://www.examplesitname.com/EN/en/tshirt-jeans.aspx?productid=324175730&documentid=295110&producttitle=Pyjama+Tshirt&categoryid=55479572&source=TreeStructureNavigation&numberpage=1&pos=TG_n_n
If I break it down, the query-string parameters are:
productid, documentid, producttitle, categoryid, source, numberpage, pos
Out of these, I only need to expose productId and documentId to the search engine. What is the best approach to achieve this?
I could add one more query-string parameter named "extendedattributes" which would contain a comma-separated list of the remaining parameters; I could then split that list back out from the request and build the response accordingly. Is that a good way to achieve this, or is there a better way?
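For what it's worth, here is a minimal sketch of that pack/unpack idea. It is written in TypeScript purely for illustration (the actual site is ASP.NET), and the helper names and the colon/comma encoding scheme are my own assumptions, not anything prescribed above:

```typescript
// Pack the non-essential parameters into one "extendedattributes" value.
// The parameter names come from the question; the encoding is an assumption.
function packExtendedAttributes(params: Record<string, string>): string {
  const extra = ["producttitle", "categoryid", "source", "numberpage", "pos"];
  return extra
    .filter((key) => params[key] !== undefined)
    .map((key) => `${key}:${encodeURIComponent(params[key])}`)
    .join(",");
}

// Split "extendedattributes" back into individual name/value pairs in the request handler.
function unpackExtendedAttributes(packed: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const pair of packed.split(",")) {
    const [key, ...rest] = pair.split(":");
    if (key) {
      result[key] = decodeURIComponent(rest.join(":"));
    }
  }
  return result;
}

// Example:
// packExtendedAttributes({ producttitle: "Pyjama Tshirt", categoryid: "55479572" })
//   -> "producttitle:Pyjama%20Tshirt,categoryid:55479572"
```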
Thanks
Google Webmaster Tools lets you designate query-string parameters for it to ignore or not ignore when indexing your site. (Look under "Site Configuration" and then "Settings".) This doesn't help you with other crawlers, of course, so it is only a partial solution.
The first thing that comes to my mind: put the rest of the parameters after a # fragment, as follows, and then use JavaScript/Ajax to read the remaining parameters and load content accordingly. However, this method may require design changes, since anything after # never reaches the web server.
http://www.examplesitname.com/EN/en/tshirt-jeans.aspx?productid=324175730&documentid=295110#producttitle=Pyjama+Tshirt&categoryid=55479572&source=TreeStructureNavigation&numberpage=1&pos=TG_n_n
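As a rough client-side sketch of that idea (the /api/product endpoint and the exact parsing details are assumptions for illustration, not part of the site above):

```typescript
// Read the parameters placed after "#" and fetch the matching content via Ajax.
// Runs in the browser; the server only ever sees productid and documentid.
function parseHashParams(hash: string): Record<string, string> {
  const params: Record<string, string> = {};
  for (const pair of hash.replace(/^#/, "").split("&")) {
    const [key, value = ""] = pair.split("=");
    if (key) {
      // "+" is used for spaces in the example URL, so translate it back
      params[decodeURIComponent(key)] = decodeURIComponent(value.replace(/\+/g, " "));
    }
  }
  return params;
}

const hashParams = parseHashParams(window.location.hash);
// e.g. { producttitle: "Pyjama Tshirt", categoryid: "55479572", ... }

// Load the extra content client-side, since the fragment never reaches the server.
fetch(`/api/product?categoryid=${encodeURIComponent(hashParams["categoryid"] ?? "")}`)
  .then((response) => response.json())
  .then((data) => {
    // render the additional details here
    console.log(data);
  });
```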
Use robots.txt (or other techniques) to block all the alternative URLs, and add only the URLs you need to a sitemap. Search engines will then only index the ones you want.
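A hedged sketch of what that robots.txt might look like for this URL structure (the patterns and the sitemap location are assumptions; Google and Bing support the * wildcard, but not every crawler does):

```
# Block crawling of URLs that carry the navigation-only parameters
User-agent: *
Disallow: /*&producttitle=
Disallow: /*&categoryid=
Disallow: /*&source=
Disallow: /*&numberpage=
Disallow: /*&pos=

Sitemap: http://www.examplesitname.com/sitemap.xml
```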