How can I get HTML content from a browser that applies HTML correction and runs JavaScript?

I need a way to get the HTML content of a page as a browser would produce it. When a page is rendered in a browser, its JavaScript is run; otherwise it is not. So HTML libraries such as lxml, BeautifulSoup and others are not going to work. I found a project named pywebkitgtk, but its purpose is to create a browser with a front end. Is there any way to give a URL to a "fake browser", have it render the page and run all of its JavaScript, and save the result to an HTML file? I don't need a front end; a back end alone is fine.

I need to use Python or Java to do that.


Selenium RC lets you drive an actual browser for this purpose, under the control of any of several languages of your choice, including both Python and Java. Check it out!

For a detailed example of use with Python, see here.
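
As a rough illustration of the idea, here is a minimal sketch in Python. Note that it uses Selenium's newer WebDriver API rather than Selenium RC itself, and it assumes the selenium package, Firefox and geckodriver are installed. The function names save_rendered_html and render_with_firefox, and the example URL and file name in the note below, are made up for this sketch.

```python
from pathlib import Path


def save_rendered_html(driver, url, out_path):
    """Load url in the given browser driver and write the rendered DOM to out_path."""
    driver.get(url)  # the browser loads the page and runs its JavaScript
    html = driver.page_source  # DOM serialized by the browser, after its HTML correction
    Path(out_path).write_text(html, encoding="utf-8")
    return html


def render_with_firefox(url, out_path):
    """Hypothetical helper: render url in headless Firefox (assumes geckodriver is available)."""
    # Imported lazily so save_rendered_html can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")  # run Firefox without a visible window
    driver = webdriver.Firefox(options=options)
    try:
        return save_rendered_html(driver, url, out_path)
    finally:
        driver.quit()  # always shut the browser down
```

Calling render_with_firefox("http://example.com/", "rendered.html") should leave the rendered page in rendered.html. Scripts that keep modifying the page after the initial load may not have finished when page_source is read, so an explicit wait may be needed for such pages.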
