Fetch a web page and save it in a database?

How can I fetch an HTML page and save it to my database in Java? Is there an easy way to do that?


Fetching a file over HTTP is pretty easy using the URL class:

String rawHtml = IOUtils.toString(new URL("http://yahoo.com"), StandardCharsets.UTF_8);

IOUtils comes from org.apache.commons.io (Apache Commons IO); its toString method reads the whole response into one String. Unfortunately, fetching straight from a java.net.URL like this gives you no control over the request beyond the website's address (no cookies, no custom headers); for that you need URL.openConnection() or an HTTP client library. Personally, I use this approach wherever I can, since HttpClient's API takes too many lines of code just to retrieve the source of a web page.
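If you do need a header or a timeout but want to stay within the JDK, something along these lines might work. It is only a sketch: the class name PageFetcher, the User-Agent value, the timeouts, and the assumption that the page is UTF-8 are all mine, not part of the answer above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    // Reads the whole stream into a String using the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    // Fetches a page through URLConnection so that request headers can be set.
    static String fetch(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", "example-fetcher/1.0"); // hypothetical value
        conn.setConnectTimeout(10_000); // milliseconds, arbitrary choice
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            // Assumption: the page is UTF-8. A more careful client would
            // look at the Content-Type response header instead.
            return readAll(in, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java PageFetcher <url>");
            return;
        }
        String html = fetch(args[0]);
        System.out.println(html.length() + " characters fetched");
    }
}
```

Run it as "java PageFetcher http://yahoo.com". Cookies can be sent the same way, via setRequestProperty("Cookie", ...), before the stream is opened.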


Not sure about your exact requirements.

For something simple, you can use Apache HttpClient.

For something more complex, you can use Apache Nutch, which does crawling, indexing, and searching as well.
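For the "save it to my database" half of the question, plain JDBC is probably enough once you have the HTML as a String. A minimal sketch follows; the class name PageStore, the table "pages", and its columns are hypothetical, and you need your database's JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PageStore {

    // Hypothetical table, adjust to your schema:
    //   CREATE TABLE pages (url VARCHAR(2048), html CLOB)
    static final String INSERT_SQL = "INSERT INTO pages (url, html) VALUES (?, ?)";

    // Inserts one page and returns the number of rows written.
    static int save(Connection db, String url, String html) throws SQLException {
        try (PreparedStatement stmt = db.prepareStatement(INSERT_SQL)) {
            stmt.setString(1, url);
            // Assumption: the driver accepts setString for a large text column.
            stmt.setString(2, html);
            return stmt.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.err.println("usage: java PageStore <jdbc-url> <page-url> <html>");
            return;
        }
        try (Connection db = DriverManager.getConnection(args[0])) {
            int rows = save(db, args[1], args[2]);
            System.out.println(rows + " row(s) inserted");
        }
    }
}
```

Using a PreparedStatement with placeholders rather than string concatenation matters here, because fetched HTML is full of quotes that would otherwise break the SQL statement.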
