How to select unique XML nodes using Ruby?
I have the following XML and am trying to get the unique product nodes, using the name child node as the key.
Original XML:
<products>
<product>
<name>White Socks</name>
<price>2.00</price>
</product>
<product>
<name>White Socks/name>
<price>2.00</price>
</product>
<product>
<name>Blue Socks</name>
<price>3.00</price>
</product>
</products>
What I'm trying to get:
<products>
<product>
<name>White Socks</name>
<price>2.00</price>
</product>
<product>
<name>Blue Socks</name>
<price>3.00</price>
</product>
</products>
I've tried various approaches that aren't worth listing here. The closest I got was an XPath query, but it returned only the names, as shown below. That is not what I want: I need the full XML as above, not just the node values.
White Socks
Blue Socks
I'm using Ruby and trying to iterate over the nodes like so:
@doc.xpath("//product").each do |node|
  # ... process each product node ...
end
The above selects ALL product nodes, whereas I want only the unique product nodes (using the child node "name" as the unique identifier).
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="kProdByName" match="product"
use="name"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match=
"product
[not(generate-id()
=
generate-id(key('kProdByName',name)[1])
)
]"/>
</xsl:stylesheet>
when applied on the provided XML document (corrected to be well-formed):
<products>
<product>
<name>White Socks</name>
<price>2.00</price>
</product>
<product>
<name>White Socks</name>
<price>2.00</price>
</product>
<product>
<name>Blue Socks</name>
<price>3.00</price>
</product>
</products>
produces the wanted, correct result:
<products>
<product>
<name>White Socks</name>
<price>2.00</price>
</product>
<product>
<name>Blue Socks</name>
<price>3.00</price>
</product>
</products>
Do note:
The identity rule copies every node "as-is".
The Muenchian method for grouping is used.
There is a single overriding template that excludes any product element that is not the first in its group.
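Since the question asks about Ruby, the "first in its group" idea can also be sketched directly in Ruby. This is only an illustration, not part of the answer above. It assumes REXML (bundled with Ruby) rather than whatever parser the asker uses, and the helper name remove_duplicate_products is hypothetical. A hash stands in for xsl:key and an object-identity check stands in for the generate-id() comparison.

```ruby
require 'rexml/document'

# Hypothetical helper: removes every <product> that is not the first
# element with its <name>, and returns the modified document.
def remove_duplicate_products(xml)
  doc = REXML::Document.new(xml)
  products = doc.root.get_elements('product')

  # Plays the role of xsl:key: maps each name to the first product carrying it.
  # Assumption of this sketch: products without a <name> are grouped under nil.
  first_by_name = {}
  products.each do |product|
    name = product.elements['name']&.text
    first_by_name[name] ||= product
  end

  # Plays the role of the generate-id() comparison: keep a product only
  # if it is the very same node as the first one in its group.
  products.each do |product|
    name = product.elements['name']&.text
    product.remove unless first_by_name[name].equal?(product)
  end

  doc
end

xml = <<~XML
  <products>
    <product><name>White Socks</name><price>2.00</price></product>
    <product><name>White Socks</name><price>2.00</price></product>
    <product><name>Blue Socks</name><price>3.00</price></product>
  </products>
XML

doc = remove_duplicate_products(xml)
doc.root.get_elements('product').each do |product|
  puts product.elements['name'].text
end
```

Both passes are linear in the number of product elements, which is the same reason the keyed XSLT scales better than a sibling-by-sibling comparison.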
XPath one-liner (note that this is O(N^2), so it will be very slow with many product elements):
/*/product[not(name = following-sibling::product/name)]
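To make the selection rule and the quadratic cost of that expression concrete, here is a rough Ruby rendering of the same logic. It is a sketch under assumptions: it uses REXML (bundled with Ruby) and a hypothetical helper name, and it reimplements the predicate in Ruby instead of evaluating the XPath itself. Note that the rule keeps a product only when no later sibling has the same name, so for duplicates the last occurrence survives, not the first.

```ruby
require 'rexml/document'

# Hypothetical helper mirroring
#   /*/product[not(name = following-sibling::product/name)]
# Returns the <product> elements that have no later sibling with the same name.
def products_without_later_duplicate(doc)
  products = doc.root.get_elements('product')
  kept = products.each_with_index.select do |product, i|
    name = product.elements['name']&.text
    # Inner scan over all following siblings: this is the O(N^2) part.
    products[(i + 1)..].none? { |later| later.elements['name']&.text == name }
  end
  kept.map(&:first)
end

xml = <<~XML
  <products>
    <product><name>White Socks</name><price>2.00</price></product>
    <product><name>White Socks</name><price>2.00</price></product>
    <product><name>Blue Socks</name><price>3.00</price></product>
  </products>
XML

doc = REXML::Document.new(xml)
products_without_later_duplicate(doc).each do |product|
  puts product.elements['name'].text
end
```

For a handful of products the nested scan is harmless. For large documents a keyed approach (a hash in Ruby, or xsl:key in XSLT) avoids rescanning the siblings for every element.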
With XSLT you can use Muenchian grouping to eliminate duplicates as follows:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:key name="prod-by-name" match="product" use="name"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="product[not(generate-id() = generate-id(key('prod-by-name', name)[1]))]"/>
</xsl:stylesheet>