
How do I prevent duplicates in XSL?

How do I prevent duplicate entries in a list and then, ideally, sort that list? When information is missing at one level, I take it from the level below to build the missing list in the level above. Currently, I have XML similar to this:

<c03 id="ref6488" level="file">
    <did>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

I then apply the following XSL:

<xsl:template match="c03/did">
    <xsl:choose>
        <xsl:when test="not(container)">
            <did>
                <!-- If no c03 container item is found, look in the c04 level for one -->
                <xsl:if test="../c04/did/container">

                    <!-- If a c04 container item is found, use the info to build a c03 version -->
                    <!-- Skip c03 container item, if still no c04 items found -->
                    <container label="Box" type="Box">

                        <!-- Build container list -->
                        <!-- Test for more than one item, and if so, list them, -->
                        <!-- separated by commas and a space -->
                        <xsl:for-each select="../c04/did">
                            <xsl:if test="position() &gt; 1">, </xsl:if>
                            <xsl:value-of select="container"/>
                        </xsl:for-each>
                    </container>
                </xsl:if>
            </did>
        </xsl:when>

        <!-- If there is a c03 container item(s), list it normally -->
        <xsl:otherwise>
            <xsl:copy-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

But I'm getting the "container" result of

<container label="Box" type="Box">156, 156, 154</container>

when what I want is

<container label="Box" type="Box">154, 156</container>

Below is the full result that I'm trying to get:

<c03 id="ref6488" level="file">
    <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

Thanks in advance for any help!


There is no need for an XSLT 2.0 solution for this problem.

Here is an XSLT 1.0 solution, which is more compact than the currently selected XSLT 2.0 solution (35 lines vs. 43 lines):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="kBoxContainerByVal"
     match="container[@type='Box']" use="."/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="c03/did[not(container)]">
   <xsl:copy>

   <xsl:variable name="vContDistinctValues" select=
    "/*/*/*/container[@type='Box']
            [generate-id()
            =
             generate-id(key('kBoxContainerByVal', .)[1])
            ]
            "/>

    <container label="Box" type="Box">
      <xsl:for-each select="$vContDistinctValues">
        <xsl:sort data-type="number"/>

        <xsl:value-of select=
        "concat(., substring(', ', 1 + 2*(position() = last())))"/>
      </xsl:for-each>
    </container>
    <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the originally provided XML document, the correct, wanted result is produced:

<c03 id="ref6488" level="file">
   <did>
      <container label="Box" type="Box">154, 156</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
   <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
   </c04>
   <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
   </c04>
   <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
   </c04>
</c03>

Update:

I didn't notice the requirement that the container numbers must appear sorted. Now the solution reflects this.
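
The substring(', ', 1 + 2*(position() = last())) expression in the stylesheet above can look cryptic, so here is a minimal sketch of it in isolation. The list and item element names are made up for illustration and are not part of the question's data. In XPath 1.0 a boolean used in arithmetic converts to 0 or 1, so the start offset is 1 for every item except the last (giving ', ') and 3 for the last one (giving an empty string, because the separator is only two characters long).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input:
       <list><item>a</item><item>b</item><item>c</item></list> -->
  <xsl:template match="/list">
    <xsl:for-each select="item">
      <!-- position() = last() is false (0) except on the last item (1), -->
      <!-- so substring() starts at offset 1 (', ') or at offset 3 ('')  -->
      <xsl:value-of select=
        "concat(., substring(', ', 1 + 2*(position() = last())))"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the hypothetical input shown in the comment, this prints a, b, c with no trailing separator.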


Try the following code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"></xsl:output>

<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates select="node() | @*"/>
  </xsl:copy>
</xsl:template>

  <xsl:template match="c03/did">
    <xsl:choose>
      <xsl:when test="not(container)">
        <did>
          <!-- If no c03 container item is found, look in the c04 level for one -->
          <xsl:if test="../c04/did/container">
            <xsl:variable name="foo" select="../c04/did/container[@type='Box']/text()"/>
            <!-- If a c04 container item is found, use the info to build a c03 version -->
            <!-- Skip c03 container item, if still no c04 items found -->
            <container label="Box" type="Box">

              <!-- Build container list -->
              <!-- Test for more than one item, and if so, list them, -->
              <!-- separated by commas and a space -->
              <xsl:for-each select="distinct-values($foo)">
                <xsl:sort />
                <xsl:if test="position() &gt; 1">, </xsl:if>
                <xsl:value-of select="." />
              </xsl:for-each>
            </container>
          </xsl:if>
          <!-- Copy the remaining children even when no c04 container exists -->
          <xsl:apply-templates select="*" />
        </did>
      </xsl:when>

      <!-- If there is a c03 container item(s), list it normally -->
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

The output matches what you want:

<?xml version="1.0" encoding="UTF-8"?>
<c03 id="ref6488" level="file">
  <did>
      <container label="Box" type="Box">154, 156</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
  <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
  </c04>
  <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
  </c04>
  <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
  </c04>
</c03>

The trick is to use <xsl:sort> and distinct-values() together. See the (IMHO) great book by Michael Kay, "XSLT 2.0 and XPath 2.0".


Try grouping with an <xsl:key> in XSLT. Here is an article on the Muenchian method, which should help you eliminate the duplicates: http://www.jenitennison.com/xslt/grouping/muenchian.html
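
As a rough illustration of the method the article describes, here is a minimal sketch applied to the question's container elements. The key name kBoxByValue is an arbitrary choice, and the key is document-wide, so this version assumes a single c03 in the input. A real finding aid with several c03 elements would need the parent's identity added to the key value.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every Box container by its string value -->
  <xsl:key name="kBoxByValue" match="container[@type='Box']" use="."/>

  <xsl:template match="/">
    <!-- Keep a container only if it is the first node the key returns -->
    <!-- for its own value, i.e. one representative per distinct value -->
    <xsl:for-each select="//container[@type='Box']
        [generate-id() = generate-id(key('kBoxByValue', .)[1])]">
      <xsl:sort select="." data-type="number"/>
      <xsl:if test="position() &gt; 1">, </xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the XML in the question, this should print 154, 156: the second 156 is dropped because it is not the first node in its key group, and the sort puts the remaining values in numeric order.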


A slightly shorter XSLT 2.0 version, combining approaches from other answers. Note that the sort is alphabetical: if the values "54" and "156" are found, the output will be "156, 54". For a numeric sort, use <xsl:sort select="number(.)"/> instead of <xsl:sort/>.

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/> 
    <xsl:strip-space elements="*"/>

    <xsl:template match="node()|@*"> 
        <xsl:copy> 
            <xsl:apply-templates select="node()|@*"/> 
        </xsl:copy> 
    </xsl:template> 

    <xsl:template match="c03/did[not(container)]">
        <xsl:variable name="containers" 
                      select="../c04/did/container[@label='Box'][text()]"/>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:if test="$containers">
                <container label="Box" type="Box">
                    <xsl:for-each select="distinct-values($containers)">
                        <xsl:sort/>
                        <xsl:if test="position() != 1">, </xsl:if>
                        <xsl:value-of select="."/>
                    </xsl:for-each>
                </container> 
            </xsl:if>
            <xsl:apply-templates select="node()"/> 
        </xsl:copy> 
    </xsl:template> 
</xsl:stylesheet>


A truly XSLT 2.0 solution, also quite short:

<xsl:stylesheet  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
>
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="c03/did[not(container)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>

      <xsl:variable name="vContDistinctValues" as="xs:integer*">
        <xsl:perform-sort select=
          "distinct-values(/*/*/*/container[@type='Box']/text()/xs:integer(.))">
          <xsl:sort/>
        </xsl:perform-sort>
      </xsl:variable>

      <xsl:if test="$vContDistinctValues">
        <container label="Box" type="Box">
          <xsl:value-of select="$vContDistinctValues" separator=", "/>
        </container>
      </xsl:if>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Do note:

  1. The use of types avoids the need to specify the data-type in <xsl:sort/>.

  2. The separator attribute of <xsl:value-of/> joins the items of the sequence, so no explicit loop is needed.
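
Both points can be seen in isolation in the sketch below. It uses literal sample values rather than the question's data, chosen so that string order and numeric order differ, and the variable name vNums is arbitrary. Any input document will do, since the template only matches the root.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The items are xs:integer, so the default sort compares numbers -->
    <xsl:variable name="vNums" as="xs:integer*">
      <xsl:perform-sort select="distinct-values((156, 54, 156))">
        <xsl:sort/>
      </xsl:perform-sort>
    </xsl:variable>
    <!-- separator is written between adjacent items of the sequence -->
    <xsl:value-of select="$vNums" separator=", "/>
  </xsl:template>
</xsl:stylesheet>
```

This prints 54, 156. Had the values been sorted as strings, 156 would come first.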


The following XSLT 1.0 transformation does what you are looking for:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> 
  <xsl:output encoding="utf-8" />

  <!-- key to index containers by these three distinct qualities: 
       1: their ancestor <c??> node (represented as its unique ID)
       2: their @type attribute value
       3: their node value (i.e. their text) -->
  <xsl:key 
    name  = "kContainer" 
    match = "container"
    use   = "concat(generate-id(../../..), '|', @type, '|', .)"
  />

  <!-- identity template to copy everything as is by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <!-- special template for <did>s without a <container> child -->
  <xsl:template match="did[not(container)]">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <container label="Box" type="Box">
        <!-- from subordinate <container>s of type Box, use the ones
             that are *the first* to have that certain combination 
             of the three distinct qualities mentioned above -->
        <xsl:apply-templates mode="list-values" select="
          ../*/did/container[@type='Box'][
            generate-id()
            =
            generate-id(
              key(
                'kContainer', 
                concat(generate-id(../../..), '|', @type, '|', .)
              )[1]
            )
          ]
        ">
          <!-- sort them by their node value -->
          <xsl:sort select="." data-type="number" />
        </xsl:apply-templates>
      </container>
      <xsl:apply-templates select="node()" />
    </xsl:copy>
  </xsl:template>

  <!-- generic template to make list of values from any node-set -->
  <xsl:template match="*" mode="list-values">
    <xsl:value-of select="." />
    <xsl:if test="position() &lt; last()">
      <xsl:text>, </xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Returns

<c03 id="ref6488" level="file">
  <did>
    <container label="Box" type="Box">154, 156</container>
    <unittitle>Clinic Building</unittitle>
    <unitdate era="ce" calendar="gregorian">1947</unitdate>
  </did>
  <c04 id="ref34582" level="file">
    <did>
      <container label="Box" type="Box">156</container>
      <container label="Folder" type="Folder">3</container>
    </did>
  </c04>
  <c04 id="ref6540" level="file">
    <did>
      <container label="Box" type="Box">156</container>
      <unittitle>Contact prints</unittitle>
    </did>
  </c04>
  <c04 id="ref6606" level="file">
    <did>
      <container label="Box" type="Box">154</container>
      <unittitle>Negatives</unittitle>
    </did>
  </c04>
</c03>

The generate-id() = generate-id(key(...)[1]) part is what's called Muenchian grouping. Unless you can use XSLT 2.0, this is the way to go.
