How to remove duplicate xml-nodes using xslt?

Using XSLT, I want to remove duplicate nodes when all of their child values match exactly.

In this XML, node 3 should be removed because it is an exact copy of node 1.

<root> 
    <trips> 
      <trip> 
        <got_car>0</got_car> 
        <from>Stockholm, Sweden</from> 
        <to>Gothenburg, Sweden</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip>
      <trip> 
        <got_car>0</got_car> 
        <from>Stockholm, Sweden</from> 
        <to>New york, USA</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip>
      <trip> 
        <got_car>0</got_car> 
        <from>Stockholm, Sweden</from> 
        <to>Gothenburg, Sweden</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip>
      <trip> 
        <got_car>1</got_car> 
        <from>Test, Duncan, NM 85534, USA</from> 
        <to>Test, Duncan, NM 85534, USA</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip> 
    </trips> 
</root>


With a better design, this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kTripByContent" match="trip"
             use="concat(got_car,'+',from,'+',to,'+',when_iso)"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="trip[generate-id() !=
                              generate-id(key('kTripByContent',
                                              concat(got_car,'+',
                                                     from,'+',
                                                     to,'+',
                                                     when_iso))[1])]"/>
</xsl:stylesheet>

Output:

<root>
    <trips>
        <trip>
            <got_car>0</got_car>
            <from>Stockholm, Sweden</from>
            <to>Gothenburg, Sweden</to>
            <when_iso>2010-12-06 00:00</when_iso>
        </trip>
        <trip>
            <got_car>0</got_car>
            <from>Stockholm, Sweden</from>
            <to>New york, USA</to>
            <when_iso>2010-12-06 00:00</when_iso>
        </trip>
        <trip>
            <got_car>1</got_car>
            <from>Test, Duncan, NM 85534, USA</from>
            <to>Test, Duncan, NM 85534, USA</to>
            <when_iso>2010-12-06 00:00</when_iso>
        </trip>
    </trips>
</root>


This code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<xsl:key name="trip-tth" match="/root/trips/trip" use="concat(got_car, '+', from, '+', to, '+', when_iso)"/>

<xsl:template match="root/trips">   
    <xsl:copy>
        <xsl:apply-templates select="trip[generate-id(.) = generate-id( key ('trip-tth', concat(got_car, '+', from, '+', to, '+', when_iso) ) )]"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="trip">
    <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>

Will do the trick.

It relies on the fact that generate-id(), when given a node-set, returns the ID of the first node in document order. Here the node-set is every trip that shares the same key, and the key is the concatenated value of each trip's child elements.


If you are using XSLT 1.0, this answer may help: How to remove duplicate XML nodes using XSLT. It is easier with XSLT 2.0, but that is not universally deployed.
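
For comparison, here is a minimal sketch of how the XSLT 2.0 route might look, assuming an XSLT 2.0 processor is available and the same input structure as in the question. It groups trips with xsl:for-each-group on the same concatenated key used in the 1.0 answers and copies only the first trip of each group. The choice of '+' as separator is carried over from those answers and assumes the values themselves cannot make two different trips produce the same concatenated string.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- Identity template: copy everything not handled below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Group the trip children by their combined content -->
    <xsl:template match="trips">
        <xsl:copy>
            <xsl:for-each-group select="trip"
                group-by="concat(got_car,'+',from,'+',to,'+',when_iso)">
                <!-- Inside for-each-group, "." is the first trip of the current group -->
                <xsl:copy-of select="."/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

No xsl:key or generate-id() comparison is needed here, because the grouping instruction itself selects one representative per distinct key.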
