Best Java XML parser to manipulate/edit an existing XML document

Task: I have an existing XML document (UTF-8) that uses XML namespaces and XML Schema. I need to navigate to a particular element, append content to it (the new content also needs to use XML namespace prefixes), and then write the document out again.

Which XML parser library is best suited to this task?

I've seen a previous thread (Best XML parser for Java), but I wasn't sure whether dom4j or JDOM handles namespaces and XML Schema well, or has good support for UTF-8 characters.

Some parsers that seem suited to the task:

JDOM

dom4j

XOM

Woodstox

Any idea which one is best? :-) I use JDK 6 and would prefer NOT to use the built-in SAX/DOM facilities for this job, because that requires me to write too much code.

It would help to have some examples of doing such a task.


Using the JDK's built-in DOM API (javax.xml.parsers, not JDOM), taking an InputStream and making it a Document:

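// Assumes httpURLConnection (java.net.HttpURLConnection) and baseUrl (String) are defined elsewhere.
// The classes used come from java.io, javax.xml.parsers, and org.w3c.dom.
InputStream inputStream = httpURLConnection.getInputStream();
DocumentBuilderFactory docbf = DocumentBuilderFactory.newInstance();
// Namespace awareness is off by default and must be enabled explicitly.
docbf.setNamespaceAware(true);
DocumentBuilder docbuilder = docbf.newDocumentBuilder();
// The second argument is the system ID used to resolve relative URIs.
Document document = docbuilder.parse(inputStream, baseUrl);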

At that point, you have the XML in a Java object. Done. Easy.

You can either use the document object and the Java API to just walk through it, or also use XPath, which I find easier (once I learned it).

Build an XPath object, which takes a bit:

public static XPath buildXPath() {
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    xpath.setNamespaceContext(new AtomNamespaceContext());
    return xpath;
}


public class AtomNamespaceContext implements NamespaceContext {

    public String getNamespaceURI(String prefix) {
        if (prefix == null)
            // The NamespaceContext contract specifies IllegalArgumentException for a null prefix.
            throw new IllegalArgumentException("Null prefix");
        else if ("a".equals(prefix))
            return "http://www.w3.org/2005/Atom";
        else if ("app".equals(prefix))
            return "http://www.w3.org/2007/app";
        else if ("os".equals(prefix))
            return "http://a9.com/-/spec/opensearch/1.1/";
        else if ("x".equals(prefix)) 
            return "http://www.w3.org/1999/xhtml";
        else if ("xml".equals(prefix))
            return XMLConstants.XML_NS_URI;
        return XMLConstants.NULL_NS_URI;
    }

    // This method isn't necessary for XPath processing.
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // This method isn't necessary for XPath processing either.
    public Iterator getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }
}

Then just use it, which (thankfully) doesn't take much time at all:

return Integer.parseInt(xpath.evaluate("/a:feed/os:totalResults/text()", document));
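
The snippets above only read from the document, while the question also asks about appending a namespaced element and writing the result back out. Here is a minimal sketch of that step using only JDK classes. The class name AppendNamespacedElement, the method appendTitle, and the choice of an a:title child under a:feed are made up for illustration. It also assumes the "a" prefix is already declared in the input document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of any library.
class AppendNamespacedElement {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Parses xml, appends an a:title child to the root a:feed element, returns the serialized result. */
    static String appendTitle(String xml, String titleText) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

            // Map the "a" prefix used in the XPath expression to the Atom namespace.
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "a".equals(prefix) ? ATOM_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            Element feed = (Element) xpath.evaluate("/a:feed", doc, XPathConstants.NODE);
            if (feed == null) {
                throw new IllegalArgumentException("No a:feed root element found");
            }

            // createElementNS takes the namespace URI and a qualified name (prefix:localName).
            Element title = doc.createElementNS(ATOM_NS, "a:title");
            title.setTextContent(titleText);
            feed.appendChild(title);

            // An identity Transformer (no stylesheet) serializes the DOM tree.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString("UTF-8");
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Checked parser/transformer exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling AppendNamespacedElement.appendTitle(xml, "Hello") on a document whose root is a:feed should return the same document with an extra a:title child at the end of the root. If the prefix is not declared in the input, it may be safer to add the xmlns:a attribute to the new element explicitly rather than rely on the serializer.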


Use XSLT. Seriously. This is a perfect job for it. Use a copy (identity) template to copy everything as is, except at the place where you need to add more XML. You can even add the new content by writing XML directly instead of manipulating the DOM.

This is the copy template:

<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

I know a lot of people hate XSLT, but this is a task where it would really shine and take almost no code. Also, you could just use what's in the JDK.
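
To make the suggestion concrete, here is a rough sketch that runs the copy template plus one extra template through the JDK's own javax.xml.transform API. The class name IdentityPlusAppend, the a:feed match pattern, and the inserted a:title element are assumptions chosen for illustration. A real stylesheet would match whatever element needs the new content.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of any library.
class IdentityPlusAppend {

    // Copy template plus one override. The more specific a:feed pattern has a
    // higher default priority than node(), so it wins for that element.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "    xmlns:a='http://www.w3.org/2005/Atom'>"
        + "  <xsl:template match='node() | @*'>"
        + "    <xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
        + "  </xsl:template>"
        + "  <xsl:template match='a:feed'>"
        + "    <xsl:copy>"
        + "      <xsl:apply-templates select='@* | node()'/>"
        + "      <a:title>Added by XSLT</a:title>"
        + "    </xsl:copy>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the result as a string. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }
}
```

The new element is written as ordinary XML inside the stylesheet, so its namespace prefix is handled by the XSLT processor rather than by hand-written DOM calls. For file input and UTF-8 output, the string-based sources could be swapped for StreamSource and StreamResult over files or streams.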


Since writing too much code is the main issue for you, you might want to consider jOOX:

http://code.google.com/p/joox/

I created jOOX as a port of jQuery to Java. The underlying technology is Java's standard DOM. Some sample code:

// Find the order at index 4 and add an element "paid"
$(document).find("orders").children().eq(4)
           .append("<paid>true</paid>");

// Find those orders that are paid and flag them as "settled"
$(document).find("orders").children().find("paid")
           .after("<settled>true</settled>");

// Add a complex element
$(document).find("orders").append(
  $("order", $("date", "2011-08-14"),
             $("amount", "155"),
             $("paid", "false"),
             $("settled", "false")).attr("id", "13"));

Note: Namespaces are not yet explicitly supported, but you can work around that.


It sounds like you can write an XSLT stylesheet to do what you want.
