
Get web page content to String is very slow

I download a web page with HttpURLConnection.getInputStream(), and to get the content into a String I use the following method:

String content = "";
String line;
InputStreamReader isr = new InputStreamReader(pageContent);
BufferedReader br = new BufferedReader(isr);
try {
    while ((line = br.readLine()) != null) {
        content += line;
    }
    return content;
} catch (Exception e) {
    System.out.println("Error: " + e);
    return null;
}

Downloading the page is fast, but turning the content into a String is very slow. Is there a faster way to get the content into a String?

I convert it to a String in order to insert it into the database.


Read into a buffer by a fixed number of bytes, not something arbitrary like lines. That alone should be a good start to speeding this up, since the reader no longer has to scan for line endings.
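As a sketch of that idea (the `readAll` helper and the 8 KB buffer size are my own choices, not from the question), you can read the stream in fixed-size char chunks and append them to a StringBuilder:

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class ChunkedReader {
    // Reads the whole stream in fixed-size char chunks instead of line by line.
    static String readAll(InputStream in) throws Exception {
        Reader reader = new InputStreamReader(in, "UTF-8");
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];          // 8 KB chunks; size is arbitrary
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);             // append only the chars actually read
        }
        return sb.toString();
    }
}
```

This also preserves line breaks, which the original line-by-line loop silently dropped.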


Use a StringBuffer instead.

Edit for an example:

StringBuffer buffer = new StringBuffer();

for (int i = 0; i < 20; ++i)
    buffer.append(i);    // append(int) converts the primitive for you

String result = buffer.toString();
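Applied to the reading loop from the question (using StringBuilder, the unsynchronized variant, which is a fine choice here since only one thread touches it), the sketch looks like this; the `readLines` helper name is mine:

```java
import java.io.BufferedReader;

public class LineJoiner {
    // Appends each line to a StringBuilder instead of concatenating Strings,
    // which avoids building a new String object on every iteration.
    static String readLines(BufferedReader br) throws Exception {
        StringBuilder content = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null) {
            content.append(line).append('\n'); // re-add the line break readLine() strips
        }
        return content.toString();
    }
}
```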


Use a BLOB/CLOB to put the content directly into the database. Is there a specific reason for building the string line by line before inserting it?
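A minimal sketch of that approach, assuming a live JDBC connection `conn` and a table `pages` with a CLOB column `content` (both names are hypothetical, adjust for your schema):

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class PageStore {
    // Streams the page body straight into a CLOB column,
    // never materializing the whole page as a String in memory.
    static void store(Connection conn, String url, InputStream pageContent) throws Exception {
        Reader reader = new InputStreamReader(pageContent, "UTF-8");
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO pages (url, content) VALUES (?, ?)")) {
            ps.setString(1, url);
            ps.setCharacterStream(2, reader);  // the driver consumes the stream on execute
            ps.executeUpdate();
        }
    }
}
```

This needs a real database connection to run, so treat it as an outline rather than a drop-in.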


I'm using jsoup to extract specific content from a page, and here is a web demo based on jQuery and jsoup that grabs any content of a web page; you specify the ID or class of the page content you want to capture: http://www.gbin1.com/technology/democenter/20120720jsoupjquerysnatchpage/index.html
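A small jsoup sketch of selecting by ID (assumes the jsoup library is on the classpath; the `#main-content` selector and the inline HTML are hypothetical examples, not from the demo page):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Extract {
    public static void main(String[] args) {
        // Parse an HTML string; Jsoup.connect(url).get() would fetch a live page instead.
        Document doc = Jsoup.parse("<div id='main-content'><p>Hello</p></div>");
        // Select by ID; use ".someClass" for a class selector.
        String text = doc.select("#main-content").text();
        System.out.println(text);
    }
}
```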
