Is it possible to do HTML scraping and data mining with Python?

Can I gather data intelligently and do HTML scraping using Python? I have no knowledge of it, so I would like to get some idea.


Look at the scrapy module:

http://scrapy.org/


You certainly can - I developed this library in Python for my web scraping work.

A good parsing library is lxml.

If you are new to Python you may want to work through this ebook first.


Try using urllib2 and Beautiful Soup.

urllib2 is useful for requesting URLs programmatically. It's part of the standard library: http://docs.python.org/library/urllib2

Beautiful Soup is good for mining HTML/XML and can be found here: http://pypi.python.org/pypi/BeautifulSoup
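As a rough sketch of the fetch-then-parse workflow described above: the example below sticks to the standard library so it runs without installing anything. It assumes Python 3, where urllib2's functionality lives in urllib.request, and it swaps in html.parser for Beautiful Soup. The names LinkCollector, extract_links, fetch_html and SAMPLE_HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen  # Python 3 home of what urllib2 provided


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_html(url, timeout=10):
    """Download a page and decode it to text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Inline sample so the parsing step can be tried offline.
SAMPLE_HTML = (
    '<html><body>'
    '<a href="http://scrapy.org/">Scrapy</a> '
    '<a href="http://htql.net">HTQL</a>'
    '</body></html>'
)

if __name__ == "__main__":
    print(extract_links(SAMPLE_HTML))
```

To scrape a live page, pass the result of fetch_html(some_url) to extract_links. With Beautiful Soup installed, the parsing step would typically be shorter, but the overall shape (request the URL, then walk the parsed HTML) stays the same.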


You may also use the htql library: http://htql.net.
