Trouble parsing quotes with SAX parser (javax.xml.parsers.SAXParser) on Android API 1.5

When using a SAX parser, parsing fails when there is a " in the node content. How can I resolve this? Do I need to convert all " characters?

In other words, anytime I have a quote in a node:

 <node>characters in node containing "quotes"</node>

That node gets butchered into multiple character arrays when the Handler is parsing it. Is this normal behaviour? Why should quotes cause such a problem?

Here is the code I am using:

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

 ...


HttpGet httpget = new HttpGet(GATEWAY_URL + "/" + question.getId());
httpget.setHeader("User-Agent", PayloadService.userAgent);
httpget.setHeader("Content-Type", "application/xml");

HttpResponse response = PayloadService.getHttpclient().execute(httpget);
HttpEntity entity = response.getEntity();

if (entity != null)
{
    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xr = sp.getXMLReader();

    ConvoHandler convoHandler = new ConvoHandler();
    xr.setContentHandler(convoHandler);
    xr.parse(new InputSource(entity.getContent()));

    entity.consumeContent();

    messageList = convoHandler.getMessageList();
}


The error is in your handler class referenced in your most recent comment.

A common error in writing a ContentHandler is to assume the characters method is only going to be called once with all the character data. It can in fact be called multiple times, each time with a chunk of the character data, and you have to collect those chunks yourself. The chopping up into multiple character arrays is normal behavior. A literal " is legal in element content, so you do not need to convert the quotes.

Probably you need to initialize a collector (maybe a StringBuilder) in your startElement method, append to it in your characters method, and then use the collected text in your endElement method, which should be where the message.setText shown in your comment is called.
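A minimal sketch of that collector pattern, since your actual ConvoHandler isn't shown here. The class names CollectingHandlerDemo and TextCollectingHandler, the parseNodes helper, and the element name "node" are all made up for illustration. It matches on qName, which is what a non-namespace-aware parser reports; adjust if your parser fills in localName instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CollectingHandlerDemo {

    // Hypothetical stand-in for ConvoHandler: gathers the full text of each <node>.
    static class TextCollectingHandler extends DefaultHandler {
        private final List<String> texts = new ArrayList<String>();
        private StringBuilder current; // non-null only while inside a <node>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("node".equals(qName)) {
                current = new StringBuilder(); // start a fresh collector
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element; append every chunk.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("node".equals(qName)) {
                // Only here is the text complete; this is where setText would go.
                texts.add(current.toString());
                current = null;
            }
        }

        List<String> getTexts() {
            return texts;
        }
    }

    // Assumed helper: parses an XML string and returns the text of every <node>.
    static List<String> parseNodes(String xml) throws Exception {
        SAXParser sp = SAXParserFactory.newInstance().newSAXParser();
        TextCollectingHandler handler = new TextCollectingHandler();
        sp.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getTexts();
    }
}
```

However many times characters is called for one element, the text is only read in endElement, so the chunking no longer matters.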


The correct answer has already been given (there is no guarantee that character data is delivered as a single event). One thing to consider is that a parser with a StAX (or XmlPull) "pull" interface might work better: a StAX parser can be forced to report all character data as a single token by enabling coalescing. StAX (and pull parsers in general) are usually considered a bit more convenient to use than SAX, and there are implementations that run on Android as well (I think the Android SDK even bundles XmlPull); Woodstox and Aalto should work.
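For illustration, here is roughly what enabling coalescing looks like with the standard javax.xml.stream API. It is a sketch that assumes a StAX implementation is on the classpath (on Android you would have to bundle one yourself); the class name CoalescingStaxDemo, the readNodeText helper, and the element name "node" are made up for the example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class CoalescingStaxDemo {

    // Assumed helper: returns the text of the first <node> element, or null if there is none.
    static String readNodeText(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one CHARACTERS event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "node".equals(reader.getLocalName())) {
                    // With coalescing on, the element's text arrives as a single event.
                    if (reader.next() == XMLStreamConstants.CHARACTERS) {
                        return reader.getText();
                    }
                    return ""; // element was empty or starts with a child element
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

With a pull parser the calling code asks for the next event, so there are no callbacks in which to track state.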
