
Searching an XML document and getting a subset of the nodes as XML

Given a search term, how can I search the attributes of the nodes in an XML document and return XML that contains only the nodes matching the term, along with their parents all the way up to the root node?

Here is an example of the input XML:

<root>
  <node name = "Amaths"> 
    <node name = "Bangles"/> 
  </node>
  <node name = "C">
    <node name = "Dangles">
      <node name = "E"> 
        <node name = "Fangles"/> 
      </node>
    </node>
    <node name = "Gdecimals" />
  </node>
  <node name = "Hnumbers"/> 
  <node name = "Iangles"/> 
</root>

The output I'm looking for with the search term "angles":

<root>
  <node name = "Amaths"> 
    <node name = "Bangles"/> 
  </node>
  <node name = "C">
    <node name = "Dangles">
      <node name = "E"> 
        <node name = "Fangles"/> 
      </node>
    </node>
  </node>
  <node name = "Iangles"/> 
</root>

The XPath that I use to search the xml is "//*[contains(@name,'angles')]"

I'm using Nokogiri in Ruby to search the XML, which gives me a NodeSet of all nodes that match the term. I cannot figure out how to reconstruct the XML from that set of nodes.

Thanks!

EDIT: Fixed the example should have been . Thanks Dimitre.

EDIT 2: Fixed the xml again for well-formedness.


First, note that the wanted output as presented is not well-formed: the following element has no end tag later in the document:

<node name = "C">

The result of evaluating an XPath expression can be a set of nodes from the XML document, but these nodes can't be altered by XPath.

This XPath expression selects the "nodes that match the term along with their parents all the way tracing to the root node":

//*[contains(@name,'angles')]/ancestor-or-self::*

However, the nodes are not changed and they still contain all of their children, meaning that the complete subtree rooted at root is still the subtree of root in the returned result.

If you want to obtain a new document (set of nodes) with a different structure than the original XML document, you have to use another language that hosts XPath. There are many such languages, such as XSLT, XQuery, and any language with an XML DOM implementation.

Here is an XSLT transformation, producing the wanted result:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="*[not(descendant-or-self::*[contains(@name, 'angles')])]"/>
</xsl:stylesheet>

When this transformation is applied to the provided XML document (corrected to be well-formed):

<root>
  <node name = "Amaths">
    <node name = "Bangles"/>
  </node>
  <node name = "C">
    <node name = "Dangles">
      <node name = "E">
        <node name = "Fangles"/>
      </node>
      <node name = "Gdecimals" />
    </node>
  </node>
  <node name = "Hnumbers"/>
  <node name = "Iangles"/>
</root>

the wanted (correct) result is produced:

<root>
   <node name="Amaths">
      <node name="Bangles"/>
   </node>
   <node name="C">
      <node name="Dangles">
         <node name="E">
            <node name="Fangles"/>
         </node>
      </node>
   </node>
   <node name="Iangles"/>
</root>