Processing RSS Feeds with Namespaces in Android

I'm trying to write an XML parser that takes an RSS feed and fetches the image URLs held in the url attribute of each <media:thumbnail> tag. This is all done via android.util.Xml and is an adaptation of the code shown here. An example RSS feed that I'm trying to use is the BBC News RSS feed.

However, media is an additional namespace, and (probably) as a result my parser isn't working as it should.

A version of my parse method is below. Is there any (no doubt simple) way to get my list of image URLs working?

public List<String> parse() {
    InputStream feedStream;

    try {
        // MalformedURLException is an IOException, so one catch covers both calls.
        URL feedUrl = new URL("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml");
        feedStream = feedUrl.openConnection().getInputStream();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    final List<String> ret = new ArrayList<String>();

    RootElement root = new RootElement("rss");
    Element channel = root.getChild("channel");
    Element item = channel.getChild("item");

    item.getChild("media", "thumbnail").getChild("url").setEndTextElementListener(new EndTextElementListener() {
        public void end(String body) {
            ret.add(body);
        }
    });

    try {
        Xml.parse(feedStream, Xml.Encoding.UTF_8, root.getContentHandler());
    } catch (Exception e) {
        throw new RuntimeException(e);
    }

    return ret;
}


One way I found to make the Xml parser (on Froyo, Android 2.2) work with namespace prefixes is to pass the namespace URI, rather than the prefix, as the first parameter to your item.getChild() call. For example, if your XML looks like this, your code can use the xmlns URL as the first parameter:

<?xml version="1.0" encoding="utf-8"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sample="http://www.example_web_site_here.com/dtds/sample_schema.dtd" version="2.0">
    <channel><item><sample:duration>1:00:00</sample:duration></item></channel></rss>

Your listener setup would look like this to get the duration element text:

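    item.getChild("http://www.example_web_site_here.com/dtds/sample_schema.dtd", "duration").setEndTextElementListener(new EndTextElementListener() {
        public void end(String body) {
            // itemDuration is a field of the enclosing class; a bare `this`
            // inside the anonymous listener would refer to the listener itself.
            itemDuration = body;
        }
    });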

It requires knowing the namespace URI in advance, but it has been working for me. In my case, I know the namespace.
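
Applied to the question's feed, the same idea might look like the sketch below. Two things are assumptions on my part rather than anything stated above: that the feed binds the media prefix to the Media RSS namespace http://search.yahoo.com/mrss/ (check the xmlns:media declaration in your own feed), and the sample XML with its example.com URLs, which is made up. Note also that url is an attribute of <media:thumbnail>, not a child element, so it has to be read from the element's attributes when the element starts; with android.sax that would presumably mean a StartElementListener rather than an EndTextElementListener. To keep the example runnable on a desktop JVM it uses the plain org.xml.sax API instead of android.sax, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ThumbnailSaxSketch {
    // Assumed namespace URI for the "media" prefix; verify against your feed.
    static final String MEDIA_NS = "http://search.yahoo.com/mrss/";

    /** Collects the url attribute of every thumbnail element in the media namespace. */
    static List<String> thumbnailUrls(String xml) {
        final List<String> urls = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Without this, uri and localName are not reliably populated.
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attrs) {
                    // Match on namespace URI + local name, never on the prefix.
                    if (MEDIA_NS.equals(uri) && "thumbnail".equals(localName)) {
                        // Unprefixed attributes live in the empty namespace.
                        String url = attrs.getValue("", "url");
                        if (url != null) {
                            urls.add(url);
                        }
                    }
                }
            });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a real feed.
        String sample =
              "<rss xmlns:media=\"http://search.yahoo.com/mrss/\" version=\"2.0\"><channel>"
            + "<item><title>First</title><media:thumbnail url=\"http://example.com/a.jpg\"/></item>"
            + "<item><title>Second</title><media:thumbnail url=\"http://example.com/b.jpg\"/></item>"
            + "</channel></rss>";
        System.out.println(thumbnailUrls(sample));
    }
}
```

Because the match is on the namespace URI, the sketch keeps working if a feed happens to declare the same namespace under a different prefix.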


As far as I can tell, the "android" SAX parser has no support for namespace (xmlns) nesting (despite the RootElement object specifically mentioning namespaces), the stripped-down "J2SE" SAX parser is crippled in the same way, and the DOM parser is heavyweight but works.

I use DOM with XML namespaces with success, but would prefer a SAX solution that did not involve adding a working XML library like JDOM to my packages.
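
For reference, a minimal sketch of the DOM route for the question's thumbnails, using only the standard javax.xml.parsers and org.w3c.dom APIs. The namespace URI is assumed to be the Media RSS one (http://search.yahoo.com/mrss/), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ThumbnailDomSketch {
    // Assumed namespace URI for the "media" prefix; verify against your feed.
    static final String MEDIA_NS = "http://search.yahoo.com/mrss/";

    /** Returns the url attribute of every media-namespace thumbnail element. */
    static List<String> thumbnailUrls(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Required for getElementsByTagNameNS to see namespace URIs.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList nodes = doc.getElementsByTagNameNS(MEDIA_NS, "thumbnail");
            List<String> urls = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element thumbnail = (Element) nodes.item(i);
                if (thumbnail.hasAttribute("url")) {
                    urls.add(thumbnail.getAttribute("url"));
                }
            }
            return urls;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The whole document is held in memory, which is the weight mentioned above; for a typical news feed that is usually tolerable, but a streaming parser scales better for large feeds.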


I would not recommend trying to implement your own RSS parser; use a standard library for that instead.

You need to cater to all formats (RSS 1, RSS 2, Atom, etc.), and even then you will have to contend with poorly formatted feeds.

I faced similar problems in the past, so I decided to do my feed parsing on a server and just fetch the parsed contents. This lets me run more complex libraries and parsers, which I can modify without pushing out updates for my app. You should aim to keep your app lightweight and push as much logic out of it as you can (to your own backend server).

I have the following service running on App Engine, which allows much simpler XML/JSON parsing at your end. The response has a fixed, simple structure. You can use this for parsing:

http://evecal.appspot.com/feedParser

You can send both POST and GET requests with the following parameters.

feedLink: the URL of the RSS feed
response: JSON or XML as the response format

Examples:

For a POST request

curl --data-urlencode "feedLink=http://feeds.bbci.co.uk/news/world/rss.xml" --data-urlencode "response=json" http://evecal.appspot.com/feedParser

For a GET request

evecal.appspot.com/feedParser?feedLink=http://feeds.nytimes.com/nyt/rss/HomePage&response=xml
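
If you build the GET URL in code, it is probably safer to URL-encode the feedLink value, since a feed URL that carries its own query string would otherwise be split at the first &. A small sketch, with a hypothetical helper name and no assumptions about what the service sends back:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FeedParserRequestSketch {
    private static final String ENDPOINT = "http://evecal.appspot.com/feedParser";

    /** Builds the GET URL with both parameters percent-encoded. */
    static String buildGetUrl(String feedLink, String responseFormat) {
        try {
            return ENDPOINT
                    + "?feedLink=" + URLEncoder.encode(feedLink, "UTF-8")
                    + "&response=" + URLEncoder.encode(responseFormat, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```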

My Android app "NewsSpeak" uses this too.
