Missing and unexpected chars when reading a large input stream using BufferedInputStream in Java

I have to read a large InputStream coming from a URL. I wrap the InputStream in a BufferedInputStream, read it into a byte[], and append each byte[] to a StringBuilder after converting it to a String. After all the data has been appended to the StringBuilder, the resulting String has some missing and unexpected chars. I didn't specify an encoding (e.g. UTF-8) here because the response already arrives in the format I expect.

Can you give any suggestions to solve this?

Code :

    BufferedInputStream brIn    = new BufferedInputStream(connection.getInputStream());
    StringBuilder response      = new StringBuilder(1000);

    byte[] byteBfr  = new byte[8192];
    int n=0;

    while((n=brIn.read(byteBfr,0,byteBfr.length)) != -1){
        response.append(new String(byteBfr).toCharArray(),0,n);
    }

    return  response.toString();

Output : This is part of the resulting response. The complete response contains about 554595 lines.

Expected Result :

  <Hotel>
    <CiID>31</CiID>
    <HoID>58617</HoID>
    <Name>HARRY΄S</Name>
    <Address>PROTARAS</Address>
    <Phone>00357 23 834100</Phone>
    <Fax>0035723831860</Fax>
    <Stars>3</Stars>
  </Hotel>

Actual Result :

  <Hotel>
    <CiID>31</CiID>
    <HoID>58617</HoID>
    <Name>HARRY΄S</Name>
    <Address>PROTARAS</AdAdress>
 <   <Phone>00357 23 834100</Phone>
    <Fa9x>00390<P654224546</Fax>
    <Stars>3</Stars>
  </Hotel>

In the actual result above you can see the unexpected chars in the Address, Phone and Fax elements.


Since you're reading in the entire string at once (as opposed to processing it as it arrives), consider using a BufferedReader.

    import java.io.*;
    import java.net.*;

    public class UrlReading {
      public static void main(String[] args) throws Exception {
        URL url = new URL("http://google.com");
        // Decode bytes to chars with an explicit charset; closing the reader closes the stream
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openConnection().getInputStream(), "UTF-8"))) {
          String inputLine;
          while ((inputLine = reader.readLine()) != null) {
            // Print the line just read; calling readLine() again here would skip every other line
            System.out.println(inputLine);
          }
        }
      }
    }
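
As for why the loop in the question garbles the text: `new String(byteBfr)` decodes the whole 8192-byte buffer on every pass, including stale bytes left over from earlier reads, and it uses the platform default charset. The first `n` chars of that result match the `n` bytes just read only if every char is exactly one byte, and a multi-byte char such as the `΄` in `HARRY΄S` can also be split across two reads. If you want the whole response in a single String rather than line by line, a minimal sketch is to let an InputStreamReader do the decoding and append only the chars actually read. The class name `StreamText` and the method `readAll` are made up for illustration, and UTF-8 is an assumption about what the server sends.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question or answer.
final class StreamText {

    // Reads the whole stream and decodes it with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder response = new StringBuilder();
        char[] buffer = new char[8192];
        // The reader keeps incomplete multi-byte sequences between reads,
        // so a char split across two network chunks is still decoded correctly.
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
                // Append only the n chars filled by this read.
                response.append(buffer, 0, n);
            }
        }
        return response.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for connection.getInputStream(); UTF-8 is an assumption.
        byte[] xml = "<Name>HARRY\u0384S</Name>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(xml), StandardCharsets.UTF_8));
    }
}
```

In the question's method this would be called as `StreamText.readAll(connection.getInputStream(), StandardCharsets.UTF_8)`, replacing UTF-8 with whatever charset the response actually declares.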

Alternatively, if you're reading XML, consider using a solution that parses the XML for you, like:

    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("http://google.com");
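
`DocumentBuilder.parse` also accepts an InputStream, so you can hand it the connection's stream directly and let the parser work out the encoding from the XML declaration instead of building a String first. Below is a rough sketch that pulls the `Name` of each `Hotel` element. The element names are taken from the sample in the question, while the class `HotelXml`, the method `hotelNames` and the `<Hotels>` wrapper element in the demo are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class; element names come from the question's sample.
final class HotelXml {

    // Parses the stream as XML and returns the text of each Hotel's Name element.
    static List<String> hotelNames(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        NodeList hotels = doc.getElementsByTagName("Hotel");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < hotels.getLength(); i++) {
            Element hotel = (Element) hotels.item(i);
            NodeList name = hotel.getElementsByTagName("Name");
            if (name.getLength() > 0) {
                names.add(name.item(0).getTextContent());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for connection.getInputStream(); the wrapper element is assumed.
        String xml = "<Hotels><Hotel><CiID>31</CiID><Name>HARRY\u0384S</Name></Hotel></Hotels>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(hotelNames(in));
    }
}
```

A DOM tree holds the whole document in memory, so for a response of several hundred thousand lines a streaming parser (SAX or StAX) may be a better fit.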