
Web crawler capable of interpreting JavaScript in Python for Windows

My ultimate goal is to build a web crawler capable of downloading all of the images on a webpage. From the reading I've done, my understanding is that I need to embed a rendering/layout engine such as Gecko or WebKit.

Unfortunately, I'm running Windows, so PyWebkit is out, and short of learning C++ for Gecko or Java to use Rhino, I'm not sure where to turn.

Is there a reliable rendering engine with Python bindings that will work on Windows (64-bit, Windows 7)? Is there an easy way to execute JavaScript within a Python script on Windows?


You don't need WebKit to do that. All you need is an engine to run JavaScript code, so take a look at Google V8 or Mozilla SpiderMonkey.

If you prefer Python for building your crawler, you may want to use PyV8, as it provides all the necessary bindings for V8.
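Whichever engine ends up running the scripts, the crawler still has to pull the image URLs out of the resulting markup. Here is a minimal sketch of that step using only the Python 3 standard library. The names `ImageCollector` and `find_image_urls` are made up for illustration, and it only sees `<img>` tags present in the HTML string it is given, so images that a page adds through JavaScript would not show up unless the scripts have already been run.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative src values
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags also land here.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            self.image_urls.append(urljoin(self.base_url, src))


def find_image_urls(html, base_url):
    """Return the absolute image URLs found in an HTML string, in order."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.image_urls
```

Each returned URL could then be passed to something like `urllib.request.urlretrieve` to save the file. Fetching the page itself and handling errors are left out of this sketch.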
