How do you scrape AJAX push sites?
Take a look at any live auction on http://www.quibids.com/. I want to scrape the Bid History, which appears to be updated by a JavaScript timer. When I use Inspect Element in Chrome, the source updates automatically. Is there some way to do that with screen scraping? I'm using Ruby, if it matters. What I want to avoid is hammering that page every second.
You could use a browser engine that can execute the JavaScript, such as WebKit (there is a scriptable wrapper for it, WebkitDriver).
Or check what the JavaScript timer is doing with a tool like Firebug. It is most likely making an AJAX request to fetch the updated data, and you can call those AJAX URLs directly.
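To illustrate the second approach, here is a minimal Ruby sketch of calling such an AJAX URL directly. Everything specific in it is an assumption: `UPDATE_URL` is a placeholder (the real address has to be read from the browser's network panel), the JSON shape with a `bids` array and an `id` per bid is made up for the example, and `fetch_update`, `new_bids`, and `poll` are hypothetical helper names. It only shows the general idea of requesting the small update payload and keeping the entries you have not seen yet.

```ruby
require 'json'
require 'net/http'
require 'set'
require 'uri'

# Placeholder URL (assumption): replace with the request the page's timer makes.
UPDATE_URL = 'http://www.example.com/ajax/bid_history.json'

# Fetch the raw response body from the AJAX endpoint.
def fetch_update(url)
  Net::HTTP.get(URI(url))
end

# Return only the bids whose id is not yet in `seen`, and record them there.
# Assumes a payload like {"bids": [{"id": 1, "user": "a", "price": "0.01"}]}.
def new_bids(body, seen)
  bids = JSON.parse(body).fetch('bids', [])
  bids.select { |bid| seen.add?(bid['id']) }
end

# Poll the endpoint and yield each new bid. `rounds: nil` means run forever;
# `fetcher` can be swapped out, which makes the loop easy to try offline.
def poll(url, interval: 2, rounds: nil, fetcher: method(:fetch_update))
  seen = Set.new
  count = 0
  while rounds.nil? || count < rounds
    new_bids(fetcher.call(url), seen).each { |bid| yield bid }
    count += 1
    sleep(interval) if rounds.nil? || count < rounds
  end
end
```

You would call it as `poll(UPDATE_URL, interval: 2) { |bid| puts bid.inspect }`. Matching the interval to whatever the page's own timer uses keeps the load no higher than a normal visitor's browser, and each request fetches only the update payload instead of the full page.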