Reading a big chunk of XML data from a socket and parsing it on the fly
I am working on an Android client that reads a continuous stream of XML data from my Java server via a TCP socket. The server sends a '\n' character as a delimiter between consecutive responses. A model of the response stream is given below.
<response1>
<datas>
<data>
.....
.....
</data>
<data>
.....
.....
</data>
........
........
</datas>
</response1>\n <--- \n acts as delimiter ---/>
<response2>
<datas>
<data>
.....
.....
</data>
<data>
.....
.....
</data>
........
........
</datas>
</response2>\n
I hope the structure is clear now. The server transmits this response zlib-compressed, so I first have to inflate whatever I read from the server, split it into responses using the delimiter, and then parse each one. I am using SAX to parse the XML.
My main problem is that the XML response coming from the server can be very large (in the range of 3 to 4 MB). To separate responses based on the delimiter (\n), I have to use a StringBuilder to store response blocks as they are read from the socket, and on some phones a StringBuilder cannot hold strings in the megabyte range. It throws an OutOfMemory exception, and from threads like this I learned that keeping large strings (even temporarily) is not a good idea.
Next I tried passing the InflaterInputStream (which in turn takes its data from the socket input stream) as the input stream of the SAX parser, without separating the XML myself and relying on SAX's ability to find the end of the document from its tags. This time one response gets parsed successfully, but on reaching the '\n' delimiter SAX throws an ExpatParserParseException saying "junk after document element".
After catching that ExpatParserParseException I tried to read again, but after throwing the exception the SAX parser closes the stream, so when I try to read/parse again I get an IOException saying the input stream is closed.
A code snippet of what I have done is given below (unrelated try/catch blocks removed for clarity).
private Socket clientSocket = null;
DataInputStream readStream = null;
DataOutputStream writeStream = null;
private StringBuilder incompleteResponse = null;
private AppContext context = null;

public boolean connectToHost(String ipAddress, int port,AppContext myContext){
    context = myContext;
    website = site;
    InetAddress serverAddr = null;
    serverAddr = InetAddress.getByName(website.mIpAddress);
    clientSocket = new Socket(serverAddr, port);
    //If connected create a read and write Stream objects..
    readStream = new DataInputStream(new InflaterInputStream(clientSocket.getInputStream()));
    writeStream = new DataOutputStream(clientSocket.getOutputStream());
    Thread readThread = new Thread(){
        @Override
        public void run(){
            ReadFromSocket();
        }
    };
    readThread.start();
    return true;
}

public void ReadFromSocket(){
    while(true){
        InputSource xmlInputSource = new InputSource(readStream);
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = null;
        XMLReader xr = null;
        try{
            sp = spf.newSAXParser();
            xr = sp.getXMLReader();
            ParseHandler xmlHandler = new ParseHandler(context.getSiteListArray().indexOf(website), context);
            xr.setContentHandler(xmlHandler);
            xr.parse(xmlInputSource);
            // postSuccessfullParsingNotification();
        }catch(SAXException e){
            e.printStackTrace();
            postSuccessfullParsingNotification();
        }catch(ParserConfigurationException e){
            e.printStackTrace();
            postSocketDisconnectionBroadcast();
            break;
        }catch (IOException e){
            postSocketDisconnectionBroadcast();
            e.printStackTrace();
            e.toString();
            break;
        }catch (Exception e){
            postSocketDisconnectionBroadcast();
            e.printStackTrace();
            break;
        }
    }
}
And now my questions are:
- Is there any way to make the SAX parser ignore junk characters after one XML response, so that it does not throw an exception and close the stream?
- If not, is there any way to avoid the out-of-memory error on the StringBuilder? To be frank, I am not expecting a positive answer on this. Any workaround?
- You might be able to put a wrapper around the reader or stream you pass to the parser that detects the newline, ends the current parse, and then launches a new parser that continues with the same stream. Taken as a whole, your stream is NOT valid XML, so you won't be able to parse it the way you have currently implemented it. Take a look at http://commons.apache.org/io/api-release/org/apache/commons/io/input/CloseShieldInputStream.html.
- No.
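The wrapper idea from the first point could look roughly like the sketch below. It is only an illustration under stated assumptions: the class names DelimitedInputStream and DelimitedDemo and the method nextSegment() are made up for this example, and it assumes that '\n' never occurs inside a response (otherwise the split would happen in the wrong place). The wrapper reports end-of-stream to the parser at each delimiter and ignores close(), which is the same job CloseShieldInputStream does, so the socket stream stays open for the next parse.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical wrapper: shows the parser one '\n'-terminated response at a time. */
class DelimitedInputStream extends FilterInputStream {
    private boolean atDelimiter = false; // delimiter seen: report EOF until nextSegment()
    private boolean sourceEnded = false; // the underlying stream really ended
    private int peeked = -1;             // one byte of lookahead read by nextSegment()

    DelimitedInputStream(InputStream in) { super(in); }

    /** Moves to the next response; returns false once the source is exhausted. */
    boolean nextSegment() throws IOException {
        atDelimiter = false;
        if (sourceEnded) return false;
        int b;
        do { b = in.read(); } while (b == '\n' || b == '\r' || b == ' ');
        if (b == -1) { sourceEnded = true; return false; }
        peeked = b;
        return true;
    }

    @Override
    public int read() throws IOException {
        if (atDelimiter || sourceEnded) return -1;
        int b;
        if (peeked != -1) { b = peeked; peeked = -1; } else { b = in.read(); }
        if (b == -1) { sourceEnded = true; return -1; }
        if (b == '\n') { atDelimiter = true; return -1; } // pretend the document ended
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int n = 0;
        while (n < len) {
            int b = read();
            if (b == -1) break;
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override public boolean markSupported() { return false; }

    @Override public void close() { /* ignored so the parser cannot close the socket stream */ }
}

class DelimitedDemo {
    /** Parses every response in the stream and returns each root element name. */
    static List<String> rootNames(InputStream raw) throws Exception {
        final List<String> roots = new ArrayList<String>();
        DelimitedInputStream in = new DelimitedInputStream(raw);
        SAXParserFactory factory = SAXParserFactory.newInstance();
        while (in.nextSegment()) {
            factory.newSAXParser().parse(in, new DefaultHandler() {
                private int depth = 0;
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    if (depth++ == 0) roots.add(qName);
                }
                @Override
                public void endElement(String uri, String local, String qName) { depth--; }
            });
        }
        return roots;
    }
}
```

In the question's code, the InflaterInputStream would be wrapped once, and each loop iteration would call nextSegment() before handing the wrapper to a fresh parser. Reading byte by byte is slow on an unbuffered source, so putting a BufferedInputStream underneath is probably worthwhile. Only one response's worth of parser state is held at a time, so no multi-megabyte string is built.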
If your SAX parser supports a push model (where you push raw data chunks into it yourself and it fires events as it parses them), then you can simply push your own opening XML tag at the beginning of the SAX session. That becomes the top-level document tag; you can then push the responses as you receive them, and they will be second-level tags as far as SAX is concerned. That way you can push multiple responses through the same SAX session, and in the OnTagOpen event (or whatever you are using) you know a new response has begun when you detect its tag name at level 1.
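The standard Java SAX API pulls from an InputStream rather than accepting pushed chunks, so a rough equivalent of this idea is to splice a synthetic root element around the socket stream with SequenceInputStream. The sketch below is an approximation under assumptions, not a tested Android solution: the class name WrappedRootDemo and the element name "stream" are invented for the example, and it assumes the responses carry no XML declaration or byte-order mark and use a UTF-8 compatible encoding. The '\n' delimiters then become ordinary whitespace inside the root element.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class WrappedRootDemo {
    /** Parses all responses in one SAX session; returns the tag name of each response. */
    static List<String> responseNames(InputStream raw) throws Exception {
        // Synthetic root element (hypothetical name) placed around the raw stream.
        InputStream open = new ByteArrayInputStream("<stream>".getBytes("UTF-8"));
        InputStream close = new ByteArrayInputStream("</stream>".getBytes("UTF-8"));
        InputStream wrapped = new SequenceInputStream(
                Collections.enumeration(Arrays.asList(open, raw, close)));

        final List<String> names = new ArrayList<String>();
        SAXParserFactory.newInstance().newSAXParser().parse(wrapped, new DefaultHandler() {
            private int depth = 0; // 0 = outside the root, 1 = directly inside it

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (depth == 1) names.add(qName); // a new response begins here
                depth++;
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                depth--;
                // depth == 1 at this point means one whole response has just been parsed
            }
        });
        return names;
    }
}
```

With a live socket the single parse call would run until the connection closes, so per-response notifications would have to be posted from endElement when the depth drops back to 1, rather than after parse() returns.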