
Issue with XML parsing using Commons JXPath

I'm trying to parse an XML document using Apache Commons JXPath, but for some reason it cannot find the child nodes after the XML has been parsed. Here's the sample code:

private static void processUrl(String seed){
    String test = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><feed xmlns=\"http://www.w3.org/2005/Atom\" xmlns:media=\"http://search.yahoo.com/mrss/\" xmlns:openSearch=\"http://a9.com/-/spec/opensearchrss/1.0/\" xmlns:gd=\"http://schemas.google.com/g/2005\" xmlns:yt=\"http://gdata.youtube.com/schemas/2007\"><id>http://gdata.youtube.com/feeds/api/videos</id><logo>http://www.youtube.com/img/pic_youtubelogo_123x63.gif</logo><link rel=\"alternate\" type=\"text/html\" href=\"http://www.youtube.com\"/><author><name>YouTube</name><uri>http://www.youtube.com/</uri></author><generator version=\"2.1\" uri=\"http://gdata.youtube.com\">YouTube data API</generator><openSearch:totalResults>144</openSearch:totalResults><entry><id>http://gdata.youtube.com/feeds/api/videos/P1lDDu9L5YQ</id><published>2010-09-20T17:41:38.000Z</published><updated>2011-09-18T22:15:38.000Z</updated><category scheme=\"http://schemas.google.com/g/2005#kind\" term=\"http://gdata.youtube.com/schemas/2007#video\"/><link rel=\"alternate\" type=\"text/html\" href=\"http://www.youtube.com/watch?v=P1lDDu9L5YQ&amp;feature=youtube_gdata\"/></entry></feed>";
    Document doc = null;
    try{
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        ByteArrayInputStream bais = new ByteArrayInputStream(test.toString().getBytes("UTF8"));
        doc = builder.parse(bais);
        bais.close();

        JXPathContext ctx = JXPathContext.newContext(doc);
        List entryNodes = ctx.selectNodes("/feed/entry");
        System.out.println("number of threadNodes " + entryNodes.size());
        int totalThreads = 0;
        for (Object each : entryNodes) {
            totalThreads++;
            Node eachEntryNode = (Node) each;
            JXPathContext msgCtx = JXPathContext.newContext(eachEntryNode);
            String title = (String) msgCtx.getValue("title");
        }
    }catch (Exception ex) {
        ex.printStackTrace();
    }
}

I've used JXPath before and never had any issues. When I debugged the document object, it didn't seem to have the child node (entry) for feed; all I could see was the root element. I also tried DOMParser, without any luck.

DOMParser parser = new DOMParser();
Document doc = (Document) parser.parseXML(new ByteArrayInputStream(sb0.toString().getBytes("UTF-8")));

I'd appreciate it if someone could provide pointers on this issue.


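This issue has to do with how JXPath handles default namespaces, which closely follows the XPath 1.0 specification: an unprefixed name in a path such as /feed/entry matches only elements that are in no namespace, while every element in your feed belongs to the default namespace http://www.w3.org/2005/Atom. This also explains why it worked after you removed that default namespace declaration. To get it to work with the default namespace in place, you can do the following: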

JXPathContext ctx = JXPathContext.newContext(doc.getDocumentElement());
// Register the default namespace, giving it a prefix of your choice
ctx.registerNamespace("myfeed", "http://www.w3.org/2005/Atom");

// Now query for entry elements using the registered prefix
List entryNodes = ctx.selectNodes("myfeed:entry");
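
The same XPath 1.0 rule can be seen without JXPath at all. The sketch below uses the JDK's built-in javax.xml.xpath API rather than JXPath, so it only illustrates the underlying behaviour and is not a drop-in replacement for the snippet above. The class name DefaultNamespaceDemo, the helper count, and the shortened two-entry feed are made up for illustration; the myfeed prefix is the same arbitrary choice as above.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question or answer.
class DefaultNamespaceDemo {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Shortened stand-in for the feed in the question (assumption: two entries).
    static final String FEED =
        "<feed xmlns=\"" + ATOM_NS + "\">"
        + "<entry><id>1</id></entry>"
        + "<entry><id>2</id></entry>"
        + "</feed>";

    /** Returns how many nodes the given XPath expression selects in FEED. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is off by default in DocumentBuilderFactory.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(FEED)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Map the arbitrary prefix "myfeed" to the Atom namespace URI.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "myfeed".equals(prefix) ? ATOM_NS : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String uri) {
                    return null; // not needed for evaluating expressions here
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return null; // not needed for evaluating expressions here
                }
            });

            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names only match elements in no namespace.
        System.out.println(count("/feed/entry"));               // prints 0
        // Prefixed names resolve to the Atom namespace and match.
        System.out.println(count("/myfeed:feed/myfeed:entry")); // prints 2
    }
}
```

The prefix used in the expression does not have to match anything in the XML itself; only the namespace URI it is bound to matters, which is why an invented prefix works for a document that declares the namespace as its default.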

For more information on the issue see the following links.

http://markmail.org/message/7iqw4bjrkwerbh46

Make jxpath namespace aware
