SEO and AJAX (Twitter-style)
Okay, so I'm trying to figure something out. I am in the planing stages of a site, and I want to implement "fetch data on scroll" via JQuery, much like Facebook and Twitter, so that I don't pull all data from the DB at once.
But I some problems regarding the SEO, how will Go开发者_运维技巧ogle be able to see all the data? Because the page will fetch more data automatically when the user scrolls, I can't include any links in the style of "go to page 2", I want Google to just index that one page.
Any ideas for a simple and clever solution?
Put links to page 2 in place.
Use JavaScript to remove them if you detect that your autoloading code is going to work.
Progressive enhancement is simply good practise.
You could use PHP (or another server-side script) to detect the user agent of webcrawlers you specifically want to target such as Googlebot.
In the case of a webcrawler, you would have to use non-JavaScript-based techniques to pull down the database content and layout the page. I would recommended not paginating the search-engine targeted content - assuming that you are not paginating the "human" version. The URLs discovered by the webcrawler should be the same as those your (human) visitors will visit. In my opinion, the page should only deviate from the "human" version by having more content pulled from the DB in one go.
A list of webcrawlers and their user agents (including Google's) is here:
http://www.useragentstring.com/pages/Crawlerlist/
And yes, as stated by others, don't reply on JavaScript for content you want see in search engines. In fact, it is quite frequently use where a developer doesn't something to appear in search engines.
All of this comes with the rider that it assumes you are not paginating at all. If you are, then you should use a server-side script to paginate you pages so that they are picked up by search engines. Also, remember to put sensible limits on the amout of your DB that you pull for the search engine. You don't want it to timeout before it gets the page.
Create a Google webmaster tools account, generate a sitemap for your site (manually, automatically or with a cronjob - whatever suits) and tell Google webmaster tools about it. Update the sitemap as your site gets new content. Google will crawl this and index your site.
The sitemap will ensure that all your content is discoverable, not just the stuff that happens to be on the homepage when the googlebot visits.
Given that your question is primarily about SEO, I'd urge you to read this post from Jeff Atwood about the importance of sitemaps for Stackoverflow and the effect it had on traffic from Google.
You should also add paginated links that get hidden by your stylesheet and are a fallback for when your endless-scroll is disabled by someone not using javascript. If you're building the site right, these will just be partials that your endless scroll loads anyway, so it's a no-brainer to make sure they're on the page.
精彩评论