How to stop nodes from picking up namespace prefixes after changing the XML structure (PHP DOMDocument)
Original XML (myfile.xml)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<blabla
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:blabla="http://www.w3.org/2000/blabla"
xmlns="http://www.w3.org/2000/blabla"
version="1.0">
<title>Hello there</title>
<metadata>
<rdf:RDF>
<cc:whtaat />
</rdf:RDF>
</metadata>
<sometag>
<anothertag id="anothertag1111">
<andanother id="yep" />
</anothertag >
</sometag>
</blabla>
The aim is to add a new child directly under the document root node and to "push" the original children under that new child:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<blabla
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:blabla="http://www.w3.org/2000/blabla"
xmlns="http://www.w3.org/2000/blabla"
version="1.0">
<magic>
<title>Hello there</title>
<metadata>
<rdf:RDF>
<cc:whtaat />
</rdf:RDF>
</metadata>
<sometag>
<anothertag id="anothertag1111">
<andanother id="yep" />
</anothertag >
</sometag>
</magic>
</blabla>
This PHP script does that:
<?php
header("Content-type: text/xml");
// Create dom document
$doc = new DOMDocument();
// preserveWhiteSpace only takes effect if it is set before the document is loaded
$doc->preserveWhiteSpace = false;
$doc->load("myfile.xml");
$doc->formatOutput = true;
// Get first child (blabla)
$blablaNode = $doc->firstChild;
// Create magic element to hold all children in blabla
$magicElement = $doc->createElement('magic');
while($blablaNode->hasChildNodes()) {
// Remove child from blablaNode and append it into magicElement
$magicElement->appendChild($blablaNode->removeChild($blablaNode->firstChild));
}
// Append magicElement to blablaNode
$magicElement = $blablaNode->appendChild($magicElement);
echo $doc->saveXML();
?>
However, the output is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<blabla xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:blabla="http://www.w3.org/2000/blabla"
xmlns="http://www.w3.org/2000/blabla" version="1.0">
<magic>
<blabla:title xmlns:default="http://www.w3.org/2000/blabla">Hello there</blabla:title>
<blabla:metadata xmlns:default="http://www.w3.org/2000/blabla" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cc="http://creativecommons.org/ns#">
<rdf:RDF>
<cc:whtaat/>
</rdf:RDF>
</blabla:metadata>
<blabla:sometag xmlns:default="http://www.w3.org/2000/blabla">
<blabla:anothertag id="anothertag1111">
<blabla:andanother id="yep"/>
</blabla:anothertag>
</blabla:sometag>
</magic>
</blabla>
So every node that is in the default namespace gets the "blabla" prefix attached to it:
<blabla:title />
How can I avoid that? To inspect what is going on, I changed the loop to:
while($blablaNode->hasChildNodes()) {
$removedChild = $blablaNode->removeChild($blablaNode->firstChild);
echo "(prefix for removed:".$removedChild->prefix.")";
$magicElement->appendChild($removedChild);
echo "(prefix for added:".$magicElement->lastChild->prefix.")";
}
The echoed output is ...(prefix for removed:)(prefix for added:)(prefix for removed:)(prefix for added:default)...
Many thanks in advance!
P.S. This is a sequel to this question, so "Or maybe someone has a much better solution in general for achieving the desirable result [adding magic node and pushing everything in it]" still applies...
Indeed, putting the default namespace declaration first, as Josh Davis notes, makes the looked-up prefix go away. +1. But that is all it fixes, because in the output...
...
<metadata xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cc="http://creativecommons.org/ns#">
...
... the declarations are still there. A clarification: I am not the creator of those XML documents, so checking the position of the default namespace declaration, even if implemented, still wouldn't give the desired result. And even if the declarations added by libxml should be there according to the standard, my task is not to validate conformance but simply to put all original child nodes, with their content intact (declarations, names, values, attributes, etc.), under that newly created container.
When you append those children, I guess that libxml looks for the first namespace declaration for "http://www.w3.org/2000/blabla" and finds "blabla". If you put your default namespace declaration first, it will find that the default namespace works and it will not prefix those nodes with blabla:
<blabla xmlns="http://www.w3.org/2000/blabla"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:blabla="http://www.w3.org/2000/blabla"
version="1.0">
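Whichever prefix ends up in the output, the elements stay in the same namespace. Here is a minimal sketch to check that, assuming a cut-down inline document instead of myfile.xml (the variable names $plain and $prefixed are only illustrative):

```php
<?php
// Assumption: a tiny stand-in document that declares the same URI twice,
// once as the default namespace and once under the "blabla" prefix.
$ns = 'http://www.w3.org/2000/blabla';

$doc = new DOMDocument();
$doc->loadXML(
    '<blabla xmlns="' . $ns . '" xmlns:blabla="' . $ns . '">'
    . '<title/><blabla:title/>'
    . '</blabla>'
);

$plain    = $doc->documentElement->childNodes->item(0); // <title/>
$prefixed = $doc->documentElement->childNodes->item(1); // <blabla:title/>

// Print the serialized name next to the namespace URI and local name.
foreach ([$plain, $prefixed] as $node) {
    printf("%s -> {%s}%s\n", $node->nodeName, $node->namespaceURI, $node->localName);
}
```

A namespace-aware consumer compares namespaceURI and localName, so it should treat both elements the same way.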
Update
The issue is entirely cosmetic, but if you want to remove redundant namespace declarations, you can dump and reload your XML:
$xml = $doc->saveXML();
$doc = new DOMDocument;
$doc->loadXML($xml, LIBXML_NSCLEAN);
echo $doc->saveXML();
Beware: if you reuse the $doc variable, node references such as $blablaNode will not keep working with it. The new $doc is a new document.
Oh, and it will also clean up redundant namespaces from the original document, possibly breaking that "keeping it intact" rule.
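To illustrate the point about stale references, here is a rough sketch that keeps the reloaded document in a separate variable and fetches the root again from it. The inline XML and the names $clean, $oldRoot and $newRoot are assumptions made for the example:

```php
<?php
// Assumption: a small inline document standing in for the restructured XML.
$doc = new DOMDocument();
$doc->loadXML(
    '<blabla xmlns="http://www.w3.org/2000/blabla">'
    . '<magic><title>Hello there</title></magic>'
    . '</blabla>'
);
$oldRoot = $doc->documentElement;

// Dump and reload, asking libxml to drop redundant namespace declarations.
$clean = new DOMDocument();
$clean->loadXML($doc->saveXML(), LIBXML_NSCLEAN);

// Nodes obtained before the reload still belong to the old document,
// so fetch them again from the new one.
$newRoot = $clean->documentElement;

echo $clean->saveXML();
```

Any further changes should then go through $newRoot, not $oldRoot.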
Oh, and I forgot to mention that you have to explicitly declare which namespace <magic/> is to be created in:
$magicElement = $doc->createElementNS('http://www.w3.org/2000/blabla', 'magic');
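Putting the pieces together, the whole restructuring might look like the sketch below. It assumes a shortened inline copy of the document with the default namespace declared first, and it reads the namespace URI from the root element instead of hard-coding it. The exact serialized output can vary with the libxml version, so none is shown.

```php
<?php
// Assumption: a shortened inline version of myfile.xml.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<blabla xmlns="http://www.w3.org/2000/blabla"
        xmlns:cc="http://creativecommons.org/ns#"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        version="1.0">
  <title>Hello there</title>
  <metadata><rdf:RDF><cc:whtaat/></rdf:RDF></metadata>
</blabla>
XML;

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false; // set before loading
$doc->formatOutput = true;
$doc->loadXML($xml);

$root = $doc->documentElement;
$ns   = $root->namespaceURI;

// Create <magic/> in the same namespace as the root element.
$magic = $doc->createElementNS($ns, 'magic');

// appendChild() moves an existing node, so no explicit removeChild() is needed.
while ($root->firstChild !== null) {
    $magic->appendChild($root->firstChild);
}
$root->appendChild($magic);

echo $doc->saveXML();
```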