Automated browsing of complicated web pages

I have a project that will involve heavy automation of complicated web pages.

I realize there are Mechanize and Beautiful Soup, but don't these break when dealing with large amounts of DOM scripting and other weird stuff you find on complicated web pages?

I think I want essentially a barebones running instance of WebKit that allows me to either do "GUI scripting" or access the DOM. Ideas?


Try Sahi with PhantomJS. Sahi is a browser automation tool, and PhantomJS is a headless WebKit browser. You can find set-up instructions here: http://sahi.co.in/w/sahi-headless-execution-with-phantomjs

Disclaimer: We created the Sahi product.


What platform are you working on? And what language do you intend to use?

Adobe AIR lets you embed WebKit inside an AIR application and interact with the page's JavaScript (there is two-way communication between the page JS and the AIR runtime).

Otherwise, if you are not bound to WebKit, you could take Mozilla Chromeless for a spin.

My apologies if none of this does what you need; I can't quite figure out what exactly you are trying to do (page scraping? submitting forms?).


For testing/scraping I would try:

  • Selenium
  • EnvJS
  • Windmill
  • Watir
  • Sahi
  • WebTest
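
To give a rough idea of what the Selenium option from the list above looks like, here is a minimal Python sketch. It assumes Selenium 4 and a local Chrome install; the function names, the URL, and the CSS selector are placeholders of my own, not anything from the question. The point is that a real browser runs the page's scripts, so you poll until the scripted element actually shows up instead of parsing the initial HTML.

```python
import time


def make_headless_chrome():
    """Start a headless Chrome session (assumes Selenium 4 and Chrome are installed)."""
    from selenium import webdriver  # imported lazily so the helper below has no hard dependency

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def read_rendered_text(driver, url, css_selector, timeout=10.0, poll=0.25):
    """Load `url`, then poll until `css_selector` matches and return that element's text.

    `driver` only needs `get()` and `find_elements(by, value)`, which is the
    interface a Selenium WebDriver provides.
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        # "css selector" is the locator strategy string behind Selenium's By.CSS_SELECTOR
        matches = driver.find_elements("css selector", css_selector)
        if matches:
            return matches[0].text
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no element matched {css_selector!r} within {timeout}s")
        time.sleep(poll)


if __name__ == "__main__":
    # Placeholder URL and selector; substitute the page you actually need to automate.
    browser = make_headless_chrome()
    try:
        print(read_rendered_text(browser, "https://example.com/", "h1"))
    finally:
        browser.quit()
```

Selenium also ships its own `WebDriverWait` helper for this kind of waiting; the hand-rolled loop here is only to keep the sketch short and easy to follow.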