Android - Generic XML Parser using SAXParser
I need to write an XML parsing class that I can reuse throughout my Android application and from what I've read a SAXParser is the best for a mobile application. I am using this guide:
http://www.jondev.net/articles/Android_XML_SAX_Parser_Example
And the type of document I wish to parse is a feed from the Blogger GData API - example would be:
<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearch/1.1/' xmlns:gd='http://schemas.google.com/g/2005' gd:etag='W/"CUIGRnc4fyp7ImA9Wx9SEEg."'>
<id>tag:blogger.com,1999:user-464300745974.blogs</id>
<updated>2010-11-29T17:58:47.937Z</updated>
<title>Tim's Blogs</title>
<link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/blogs'/>
<link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/blogs'/>
<link rel='alternate' type='text/html' href='http://www.blogger.com/profile/blogid'/>
<author>
<name>Tim</name>
<uri>http://www.blogger.com/profile/blogid</uri>
<email>noreply@blogger.com</email>
</author>
<generator version='7.00' uri='http://www.blogger.com'>Blogger</generator>
<openSearch:totalResults>2</openSearch:totalResults>
<openSearch:startIndex>1</openSearch:startIndex>
<openSearch:itemsPerPage>25</openSearch:itemsPerPage>
<entry gd:etag='W/"DUIBQHg-cCp7ImA9Wx9TF0s."'>
<id>tag:blogger.com,1999:user-464300745974.blog-blogid</id>
<published>2010-06-22T10:59:38.603-07:00</published>
<updated>2010-11-26T02:32:31.658-08:00</updated>
<title>Application Testing Blog</title>
<summary type='html'>This blog is for testing the Android application.</summary>
<link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/blogs/blogid'/>
<link rel='alternate' type='text/html' href='http://devrum.blogspot.com/'/>
<link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://devrum.blogspot.com/feeds/posts/default'/>
<link rel='http://schemas.google.com/g/2005#post' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/posts/default'/>
<link rel='http://schemas.google.com/blogger/2008#template' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/template'/>
<link rel='http://schemas.google.com/blogger/2008#settings' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/settings'/>
<author>
<name>Tim</name>
<uri>http://www.blogger.com/profile/blogid</uri>
<email>noreply@blogger.com</email>
</author>
</entry>
<entry gd:etag='W/"C08HRXo4eSp7ImA9Wx9TE0o."'>
<id>tag:blogger.com,1999:user-464300745974.blog-515600026106499737</id>
<published>2010-06-22T10:59:00.328-07:00</published>
<updated>2010-11-21T12:37:14.431-08:00</updated>
<title>Development Blog</title>
<summary type='html'>etc</summary>
<link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/blogid/blogs/515600026106499737'/>
<link rel='alternate' type='text/html' href='http://rumdev.blogspot.com/'/>
<link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://rumdev.blogspot.com/feeds/posts/default'/>
<link rel='http://schemas.google.com/g/2005#post' type='application/atom+xml' href='http://www.blogger.com/feeds/515600026106499737/posts/default'/>
<link rel='http://schemas.google.com/blogger/2008#template' type='application/atom+xml' href='http://www.blogger.com/feeds/515600026106499737/template'/>
<link rel='http://schemas.google.com/blogger/2008#settings' type='application/atom+xml' href='http://www.blogger.com/feeds/515600026106499737/settings'/>
<author>
<name>Tim</name>
<uri>http://www.blogger.com/etc</uri>
<email>noreply@blogger.com</email>
</author>
</entry>
</feed>
I need to parse the blog IDs and post IDs out of feeds like the one above. None of the SAX examples I have found are generic. I'd like to write a reusable parser; do you have any examples of how I can adapt the SAXParser approach accordingly?
SAX parsers are event-driven. You write a handler with a callback for each type of XML event (start of an element, end of an element, character data, and so on; attributes are delivered together with the start-element event). The parser then walks the XML document and reports each event to you by calling the matching method in your handler.
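To make the callback order concrete, here is a minimal sketch that just records each event as it fires. The class name SaxEventTrace and the helper trace are made up for this illustration; only the DefaultHandler callbacks and the JAXP calls come from the standard library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the SAX callbacks in the order they fire.
class SaxEventTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Attributes arrive together with the start tag, not as a separate event.
        events.add("start " + qName + " attrs=" + atts.getLength());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip whitespace-only chunks to keep the trace readable.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text " + text);
        }
    }

    // Parses the given XML string and returns the recorded events.
    static List<String> trace(String xml) {
        try {
            SaxEventTrace handler = new SaxEventTrace();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String event : trace("<feed><id>42</id><link rel='self'/></feed>")) {
            System.out.println(event);
        }
    }
}
```

Running it on a tiny document shows that every element produces a start event and an end event, with any text reported in between.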
In your particular case you are looking for two node sequences:
<feed><id>
<feed><entry><id>
So you have to store some state info inside your handler to know where you are. Here is the code (didn't try it myself, you'll have to debug it):
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DataHandler extends DefaultHandler {
    private Feed feed;
    private Entry currentEntry;
    private boolean isId = false;
    // characters() may deliver one text node in several chunks, so buffer it
    private final StringBuilder idText = new StringBuilder();

    public Feed getFeed() {
        return feed;
    }

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
        // Note: localName may be empty unless the parser is namespace-aware
        if (localName.equals("feed")) {
            feed = new Feed();
        } else if (localName.equals("entry")) {
            currentEntry = new Entry();
        } else if (localName.equals("id")) {
            isId = true;
            idText.setLength(0);
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
        if (localName.equals("feed")) {
            // </feed> - do nothing, it's the end
        } else if (localName.equals("entry")) {
            // </entry> - save current entry and reset variable
            feed.entries.add(currentEntry);
            currentEntry = null;
        } else if (localName.equals("id")) {
            // </id> - the text is complete now, so store it
            String id = idText.toString().trim();
            if (currentEntry != null) {
                currentEntry.id = id;
            } else {
                feed.id = id;
            }
            isId = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!isId) return;
        idText.append(ch, start, length);
    }

    // Public and static so callers of getFeed() can actually use the result
    public static class Feed {
        public String id;
        public List<Entry> entries = new ArrayList<Entry>();
    }

    public static class Entry {
        public String id;
    }
}
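If you want something more generic than hard-coded flags, one possible approach is to track the path of open elements and collect the text of whichever paths you ask for. The sketch below is untested against the real Blogger feed; PathCollector and collect are hypothetical names, and it assumes the elements you want are leaf elements containing only text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reusable handler: collects the text of every leaf element whose
// path from the root (for example "feed/entry/id") was requested.
class PathCollector extends DefaultHandler {
    private final Map<String, List<String>> results = new HashMap<>();
    private final List<String> path = new ArrayList<>();   // stack of open elements
    private final StringBuilder text = new StringBuilder();

    PathCollector(String... wantedPaths) {
        for (String wanted : wantedPaths) {
            results.put(wanted, new ArrayList<>());
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.add(localName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per text node, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        List<String> bucket = results.get(String.join("/", path));
        if (bucket != null) {
            bucket.add(text.toString().trim());
        }
        path.remove(path.size() - 1);
        text.setLength(0);
    }

    // Parses the XML string and returns the collected text, keyed by path.
    static Map<String, List<String>> collect(String xml, String... wantedPaths) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            PathCollector handler = new PathCollector(wantedPaths);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.results;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<feed xmlns='http://www.w3.org/2005/Atom'><id>feed-1</id>"
                + "<entry><id>blog-1</id></entry><entry><id>blog-2</id></entry></feed>";
        Map<String, List<String>> found = collect(xml, "feed/id", "feed/entry/id");
        System.out.println(found.get("feed/id"));
        System.out.println(found.get("feed/entry/id"));
    }
}
```

The same class can then be pointed at a different feed just by passing different paths, which is the reusable part; the trade-off is that you get flat lists of strings rather than typed Feed and Entry objects.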
Try something along the lines of this:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlParser extends DefaultHandler {
    // STATE_NONE is distinct from STATE_FEED so "no match" is not mistaken for <feed>
    private static final int STATE_NONE = 0;
    private static final int STATE_FEED = 1;
    private static final int STATE_ID = 2;

    // Instance fields (not static), so each handler keeps its own state
    private int mDepth = 0;
    private final StringBuilder mTheString = new StringBuilder();

    public XmlParser() {}

    @Override
    public void startDocument() throws SAXException {}

    @Override
    public void endDocument() throws SAXException {}

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
        mTheString.setLength(0);
        mDepth++;
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
        mDepth--;
        // Decide by the element that is closing, not by the last one opened
        switch (stateFor(localName, qName)) {
            case STATE_FEED:
                // Do something with the feed
                break;
            case STATE_ID:
                // Save the ID (mTheString.toString().trim()) or do whatever
                break;
            default:
                // Do nothing
                break;
        }
        mTheString.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        mTheString.append(ch, start, length);
    }

    // Falls back to qName because localName can be empty when the parser
    // is not namespace-aware
    private static int stateFor(String localName, String qName) {
        String name = localName.isEmpty() ? qName : localName;
        if (name.equals("feed")) return STATE_FEED;
        if (name.equals("id")) return STATE_ID;
        return STATE_NONE;
    }
}
You can use the custom handler with something like this:
import java.io.InputStream;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

InputStream stream = //whatever your stream is (the document)
XmlParser handler = new XmlParser(); // your custom handler
XMLReader xmlreader = XMLReaderFactory.createXMLReader();
xmlreader.setContentHandler(handler);
xmlreader.parse(new InputSource(stream));
// Then you can create a method in the handler, like getResults, to return the list of elements or something here.
So you pass your custom handler to the XMLReader and read the results from the handler once parsing finishes. During parsing, the handler is first told about the start of the document (startDocument) and then about each element in turn: startElement is called at the start of an element and endElement at its end. The characters method is called in between these two, possibly more than once for a single text node, picking up the characters (which you can then process in endElement). Parsing is finished when endDocument is called, so you can set things up and tear them down at the start and end of individual elements or of the whole document if you wish.
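As a rough illustration of that lifecycle, the sketch below resets its state in startDocument so one handler instance can be reused for several documents, and exposes a getResults method as suggested above. IdListHandler and parse are hypothetical names, and matching on qName assumes the default (non-namespace-aware) parser from SAXParserFactory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers the text of every <id> element in a document.
class IdListHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private List<String> ids = new ArrayList<>();

    @Override
    public void startDocument() {
        // Set-up: start each document with fresh state so the handler is reusable.
        ids = new ArrayList<>();
        text.setLength(0);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The element's text is complete only once its end tag is reported.
        if ("id".equals(qName)) {
            ids.add(text.toString().trim());
        }
    }

    List<String> getResults() {
        return ids;
    }

    // Parses one document with the given (possibly reused) handler.
    static List<String> parse(IdListHandler handler, String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.getResults();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        IdListHandler handler = new IdListHandler();
        System.out.println(parse(handler, "<feed><id>a</id></feed>"));
        System.out.println(parse(handler, "<feed><id>b</id><id>c</id></feed>"));
    }
}
```

Because the list is recreated in startDocument, the second parse does not carry over results from the first.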
Hope this helps, and is close to what you are looking to do.