
Merging Documents while preserving xsi:type

I have two Document objects that contain similar XML. For example:

<tt:root xmlns:tt="http://myurl.com/">
  <tt:child/>
  <tt:child/>
</tt:root>

And the other one:

<ns1:root xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ns1:child/>
  <ns1:child xsi:type="ns2:SomeType"/>
</ns1:root>

I need to merge them into one document with one root element and four child elements. The problem is that if I use document.importNode to do the merging, it handles the namespaces properly everywhere except in the value of the xsi:type attribute. So what I get as a result is this:

<tt:root xmlns:tt="http://myurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <tt:child/>
  <tt:child/>
  <ns1:child xmlns:ns1="http://myurl.com/"/>
  <ns1:child xmlns:ns1="http://myurl.com/" xsi:type="ns2:SomeType"/>
</tt:root>

As you can see, ns2 is used in xsi:type but is not defined anywhere. Is there any automated way to solve this problem?

Thanks.

ADDED:

If this task cannot be done with the default Java DOM libraries, is there some other library I can use instead?


If I fix up the namespace problem in your second file (by binding the "xsi" prefix) and do the merge using the code below, the namespace bindings are preserved in the output; or at least they are here (vanilla 64-bit Java on Windows, build 1.6.0_24).

String s1 = "<!-- 1st XML document here -->";
String s2 = "<!-- 2nd XML document here -->";

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware( true );
DocumentBuilder builder = factory.newDocumentBuilder();

Document doc1 = builder.parse( new ByteArrayInputStream( s1.getBytes() ) );
Document doc2 = builder.parse( new ByteArrayInputStream( s2.getBytes() ) );

Element doc1root = ( Element )doc1.getDocumentElement();
Element doc2root = ( Element )doc2.getDocumentElement();

NamedNodeMap atts1 = doc1root.getAttributes();
NamedNodeMap atts2 = doc2root.getAttributes();

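// Copy every namespace declaration that doc2's root lacks from doc1's root.
for( int i = 0; i < atts1.getLength(); i++ )
{
    String name = atts1.item( i ).getNodeName();
    if( name.startsWith( "xmlns:" ) && atts2.getNamedItem( name ) == null )
    {
        // In a namespace-aware DOM, declarations belong to the reserved xmlns namespace.
        doc2root.setAttributeNS( "http://www.w3.org/2000/xmlns/", name, atts1.item( i ).getNodeValue() );
    }
}

// Import doc1's children into doc2 and append them to doc2's root.
NodeList nl = doc1root.getChildNodes();
for( int i = 0; i < nl.getLength(); i++ )
{
    Node n = nl.item( i );
    doc2root.appendChild( doc2.importNode( n, true ) );
}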

TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
StreamResult streamResult = new StreamResult( System.out );
transformer.transform( new DOMSource( doc2 ), streamResult );


The problem here is the use of namespace prefixes in attribute values: something that was never considered when the namespace standard was created, and something that the common Java DOM/XML tools cannot easily handle. However, you could solve it as follows:

  1. Before merging, replace every instance of xsi:type="prefix:value" with xsi:type="{namespace}value". By doing this, you no longer depend on the prefix mapping. In your example, xsi:type="ns2:SomeType" would become xsi:type="{http://myotherurl.com/}SomeType".
  2. Merge the documents.
  3. On the result document, reverse the replacement in step 1. The prefix mappings have to be carefully managed to avoid collisions; possibly a new mapping has to be created.
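
The three steps above can be sketched with the DOM Level 3 methods lookupNamespaceURI and lookupPrefix. This is only an illustration under stated assumptions: the class name XsiTypeRewriter, its method names, and the generated "nsN" prefixes are made up for this example, and only xsi:type attributes are rewritten (other QName-valued attributes would need the same treatment).

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class XsiTypeRewriter {

    public static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** Parses an XML string with a namespace-aware parser. */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Step 1: rewrite xsi:type="prefix:local" as xsi:type="{namespace}local". */
    public static void expandXsiTypes(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            Attr type = e.getAttributeNodeNS(XSI_NS, "type");
            if (type == null || type.getValue().startsWith("{")) continue;
            String value = type.getValue().trim();
            int colon = value.indexOf(':');
            String prefix = colon < 0 ? null : value.substring(0, colon);
            // Resolve the prefix in the scope of the element carrying the attribute.
            String uri = e.lookupNamespaceURI(prefix);
            if (uri != null) {
                type.setValue("{" + uri + "}" + value.substring(colon + 1));
            }
        }
    }

    /** Step 3: turn "{namespace}local" back into "prefix:local". */
    public static void restoreXsiTypes(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            Attr type = e.getAttributeNodeNS(XSI_NS, "type");
            if (type == null || !type.getValue().startsWith("{")) continue;
            String value = type.getValue();
            int close = value.indexOf('}');
            if (close < 0) continue;
            String uri = value.substring(1, close);
            String prefix = e.lookupPrefix(uri);
            if (prefix == null) {
                // No prefix in scope for this URI: pick an unused one and declare it here.
                int n = 0;
                do { prefix = "ns" + n++; } while (e.lookupNamespaceURI(prefix) != null);
                e.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:" + prefix, uri);
            }
            type.setValue(prefix + ":" + value.substring(close + 1));
        }
    }

    public static void main(String[] args) {
        Document doc1 = parse("<tt:root xmlns:tt=\"http://myurl.com/\"><tt:child/><tt:child/></tt:root>");
        Document doc2 = parse("<ns1:root xmlns:ns1=\"http://myurl.com/\" xmlns:ns2=\"http://myotherurl.com/\""
            + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
            + "<ns1:child/><ns1:child xsi:type=\"ns2:SomeType\"/></ns1:root>");
        expandXsiTypes(doc2);                                        // step 1
        NodeList kids = doc2.getDocumentElement().getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {                 // step 2
            doc1.getDocumentElement().appendChild(doc1.importNode(kids.item(i), true));
        }
        restoreXsiTypes(doc1);                                       // step 3
        Element last = (Element) doc1.getDocumentElement().getLastChild();
        System.out.println(last.getAttributeNS(XSI_NS, "type"));
    }
}
```

Declaring the new prefix on the element that carries the attribute keeps the binding local, so it cannot clash with a prefix already in use elsewhere in the merged document.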


A single line of XQuery could do the job: construct a new node named after the context root element, then import its children together with those from the other document:

declare variable $other external; element {node-name(*)} {*/*, $other/*/*}

Although XQuery does not give you full control over namespace nodes (at least in XQuery 1.0), it has a copy-namespaces mode setting that can be used to ask for the namespace context to be kept intact, in case the implementation does not preserve it by default.

If XQuery is a viable option, then saxon9he.jar could be the "magic xml library" that you are after.

Here is sample code exposing some context, using the s9api API:

import javax.xml.parsers.DocumentBuilderFactory;
import net.sf.saxon.s9api.*;
import org.w3c.dom.Document;

...

  Document merge(Document context, Document other) throws Exception
  {
    Processor processor = new Processor(false);
    XQueryExecutable executable = processor.newXQueryCompiler().compile(
      "declare variable $other external; element {node-name(*)} {*/*, $other/*/*}");
    XQueryEvaluator evaluator = executable.load();    
    DocumentBuilder db = processor.newDocumentBuilder();
    evaluator.setContextItem(db.wrap(context));
    evaluator.setExternalVariable(new QName("other"), db.wrap(other));
    Document doc =
      DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    processor.writeXdmValue(evaluator.evaluate(), new DOMDestination(doc));
    return doc;
  }


I would use JAXB with the Mergeable plugin to generate mergeFrom methods in the schema-derived classes. Then:

  • Unmarshal o1, o2
  • Merge o1, o2 into o3 using the generated methods
  • Marshal o3

JAXB normally handles xsi:type quite well.


UPDATE

The approach in my original answer (further down) will not work when the two documents have colliding namespace prefixes (the mapping from the second document will replace the mapping from the first).
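
Whether two documents actually have colliding prefixes can be checked before merging. The following is a rough sketch, assuming a hypothetical class PrefixCollisionCheck; it compares only the declarations on the second root against what is in scope on the first root, which is where the example documents declare their prefixes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class PrefixCollisionCheck {

    /** Parses an XML string with a namespace-aware parser and returns its root element. */
    public static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Returns the prefixes declared on root2 that root1 binds to a different namespace URI. */
    public static List<String> collidingPrefixes(Element root1, Element root2) {
        List<String> collisions = new ArrayList<>();
        NamedNodeMap atts = root2.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            Attr a = (Attr) atts.item(i);
            // Only look at namespace declarations of the form xmlns:prefix="uri".
            if (!XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(a.getNamespaceURI())) continue;
            if (a.getPrefix() == null) continue; // default namespace declaration: xmlns="uri"
            String prefix = a.getLocalName();
            String boundInDoc1 = root1.lookupNamespaceURI(prefix);
            if (boundInDoc1 != null && !boundInDoc1.equals(a.getValue())) {
                collisions.add(prefix);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        Element a = parseRoot("<foo:root xmlns:foo=\"urn:ROOT\"/>");
        Element b = parseRoot("<foo:root xmlns:foo=\"urn:CHILD\" xmlns:bar=\"urn:OTHER\"/>");
        System.out.println(collidingPrefixes(a, b));
    }
}
```

An empty result suggests that copying the declarations to the root is safe; otherwise the declarations need to stay on the imported elements.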

You could copy the namespace declarations from the second document onto the imported nodes. Since a child node can override a prefix bound by its parent, this is valid:

<foo:root xmlns:foo="urn:ROOT">
    <foo:child xmlns:foo="urn:CHILD" xsi:type="foo:child-type">
       ...
    </foo:child>
</foo:root>

In the above XML the namespace bound to the prefix "foo" is overridden in the scope of the child element. You can accomplish this for your use case by doing the following:

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) throws Exception  {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();

        File file1 = new File("src/forum231/input1.xml");
        Document doc1 = db.parse(file1);
        Element rootElement1 = doc1.getDocumentElement();

        File file2 = new File("src/forum231/input2.xml");
        Document doc2 = db.parse(file2);
        Element rootElement2 = doc2.getDocumentElement();

        // Copy Child Nodes
        NodeList childNodes2 = rootElement2.getChildNodes();
        for(int x=0; x<childNodes2.getLength(); x++) {
            Node importedNode = doc1.importNode(childNodes2.item(x), true);
            if(importedNode.getNodeType() == Node.ELEMENT_NODE) {
                Element importedElement = (Element) importedNode;
                // Copy Attributes
                NamedNodeMap namedNodeMap2 = rootElement2.getAttributes();
                for(int y=0; y<namedNodeMap2.getLength(); y++) {
                    Attr importedAttr = (Attr) doc1.importNode(namedNodeMap2.item(y), true);
                    importedElement.setAttributeNodeNS(importedAttr);
                }
            }
            rootElement1.appendChild(importedNode);
        }

        // Output Document
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();
        DOMSource source = new DOMSource(doc1);
        StreamResult result = new StreamResult(System.out);
        t.transform(source, result);
    }

}

Output

<?xml version="1.0" encoding="UTF-8" standalone="no"?><tt:root xmlns:tt="http://myurl.com/">
  <tt:child/>
  <tt:child/>

  <ns1:child xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
  <ns1:child xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:SomeType"/>
</tt:root>

ORIGINAL ANSWER

In addition to copying the elements, you could copy the attributes. This will ensure that the resulting document contains the necessary namespace declarations:

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) throws Exception  {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();

        File file1 = new File("input1.xml");
        Document doc1 = db.parse(file1);
        Element rootElement1 = doc1.getDocumentElement();

        File file2 = new File("input2.xml");
        Document doc2 = db.parse(file2);
        Element rootElement2 = doc2.getDocumentElement();

        // Copy Attributes
        NamedNodeMap namedNodeMap2 = rootElement2.getAttributes();
        for(int x=0; x<namedNodeMap2.getLength(); x++) {
            Attr importedNode = (Attr) doc1.importNode(namedNodeMap2.item(x), true);
            rootElement1.setAttributeNodeNS(importedNode);
        }

        // Copy Child Nodes
        NodeList childNodes2 = rootElement2.getChildNodes();
        for(int x=0; x<childNodes2.getLength(); x++) {
            Node importedNode = doc1.importNode(childNodes2.item(x), true);
            rootElement1.appendChild(importedNode);
        }

        // Output Document
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();
        DOMSource source = new DOMSource(doc1);
        StreamResult result = new StreamResult(System.out);
        t.transform(source, result);
    }

}

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tt:root xmlns:tt="http://myurl.com/" xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <tt:child/>
  <tt:child/>

  <ns1:child/>
  <ns1:child xsi:type="ns2:SomeType"/>
</tt:root>


If you know the namespace URI and the prefix that you want to add, it can be as easy as adding an attribute to an element. This worked for me when my merged document was missing the xmlns:xsd="http://www.w3.org/2001/XMLSchema" declaration contained in my imported document:

    myDocument.getDocumentElement().setAttribute("xmlns:xsd", "http://www.w3.org/2001/XMLSchema");
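
In a namespace-aware DOM the same declaration can also be written with setAttributeNS and the reserved xmlns namespace URI, so that DOM Level 3 lookups such as lookupNamespaceURI resolve the new binding. A minimal sketch, assuming a hypothetical helper class DeclarePrefix and a stand-in <root/> document:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class DeclarePrefix {

    /** Parses an XML string with a namespace-aware parser. */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Declares xmlns:prefix="uri" on the element as a namespace-aware attribute. */
    public static void declare(Element element, String prefix, String uri) {
        element.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:" + prefix, uri);
    }

    public static void main(String[] args) {
        Document doc = parse("<root/>");
        declare(doc.getDocumentElement(), "xsd", "http://www.w3.org/2001/XMLSchema");
        // The binding is now visible to namespace lookups on the element.
        System.out.println(doc.getDocumentElement().lookupNamespaceURI("xsd"));
    }
}
```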