Fetch a web page and save it in a database?
How can I fetch an HTML page and save it to my database in Java? Is there any easy way to do that?
Retrieving a file over HTTP is pretty easy using the URL class:
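// imports: org.apache.commons.io.IOUtils, java.net.URL, java.nio.charset.StandardCharsets (UTF-8 is assumed here)
String rawHtml = IOUtils.toString(new URL("http://yahoo.com").openStream(), StandardCharsets.UTF_8);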
IOUtils comes from org.apache.commons.io; its toString method reads the whole input stream into one String. Unfortunately, when you read straight from java.net.URL with openStream() you cannot control anything besides the website's address (cookies, header information, ...) :-/ For that you have to go through URL.openConnection() or a full HTTP client. Personally, I use this approach wherever I can, since HttpClient's API takes too many lines of code just to retrieve the source of a web page.
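The one-liner only covers the fetch, so here is a rough sketch of both halves using nothing but the JDK. Treat it as an illustration under assumptions, not a finished solution: the class name PageSaver, the User-Agent value, the table pages(url, html) and the UTF-8 decoding are all made up for the example, and you need a JDBC driver for your own database to obtain the Connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class; the name and the table layout are assumptions.
class PageSaver {

    // Reads the whole response body into a String.
    // Assumption: the page is UTF-8 encoded; real pages may declare another charset.
    static String fetch(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        // Unlike URL.openStream(), a URLConnection lets you set request headers.
        conn.setRequestProperty("User-Agent", "PageSaver/0.1");
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Inserts one row into an assumed table: pages(url VARCHAR, html CLOB/TEXT).
    // The PreparedStatement placeholders keep the HTML from breaking the SQL.
    static void save(Connection db, String address, String html) throws SQLException {
        String sql = "INSERT INTO pages (url, html) VALUES (?, ?)";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, address);
            ps.setString(2, html);
            ps.executeUpdate();
        }
    }
}
```

With a driver on the classpath, usage would be along the lines of PageSaver.save(DriverManager.getConnection(jdbcUrl), address, PageSaver.fetch(address)), where jdbcUrl is whatever your database expects.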
Not sure about your exact requirements.
For something simple, you can use HttpClient.
For something more complex, you can use Nutch. It does crawling, indexing, and searching as well.