
XSLT: Splitting non-continuous elements / grouping continuous elements

I need some help implementing this in XSLT. I had already implemented it in Java using a SAX parser, but that is troublesome whenever the customer requests a change.

So we are now doing it with XSLT, which doesn't need to be compiled and deployed to a web server. I have XML like the examples below.

Example 1:

<ShotRows>
  <ShotRow row="3" col="3" bit="1" position="1"/>
  <ShotRow row="3" col="4" bit="1" position="2"/>
  <ShotRow row="3" col="5" bit="1" position="3"/>
  <ShotRow row="3" col="6" bit="1" position="4"/>
  <ShotRow row="3" col="7" bit="1" position="5"/>
  <ShotRow row="3" col="8" bit="1" position="6"/>
  <ShotRow row="3" col="9" bit="1" position="7"/>
  <ShotRow row="3" col="10" bit="1" position="8"/>
  <ShotRow row="3" col="11" bit="1" position="9"/>
</ShotRows>

Output 1:

<ShotRows>
  <ShotRow row="3" colStart="3" colEnd="11" />
</ShotRows>
<!-- because the col is continuous from 3 to 11 -->

Example 2:

<ShotRows>
  <ShotRow row="3" col="3" bit="1" position="1"/>
  <ShotRow row="3" col="4" bit="1" position="2"/>
  <ShotRow row="3" col="6" bit="1" position="3"/>
  <ShotRow row="3" col="7" bit="1" position="4"/>
  <ShotRow row="3" col="8" bit="1" position="5"/>
  <ShotRow row="3" col="10" bit="1" position="6"/>
  <ShotRow row="3" col="11" bit="1" position="7"/>
  <ShotRow row="3" col="15" bit="1" position="8"/>
  <ShotRow row="3" col="19" bit="1" position="9"/>
</ShotRows>

Output 2:

<ShotRows>
  <ShotRow row="3" colStart="3" colEnd="4" />
  <ShotRow row="3" colStart="6" colEnd="8" />
  <ShotRow row="3" colStart="10" colEnd="11" />
  <ShotRow row="3" colStart="15" colEnd="15" />
  <ShotRow row="3" colStart="19" colEnd="19" />
</ShotRows>

The basic idea is to group every run of continuous col values into one element: col 3 to 4, col 6 to 8, col 10 to 11, while col 15 and col 19 each form a range of their own. Thanks in advance.


With Java you could use Saxon 9 and XSLT 2.0 as follows:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

  <xsl:output indent="yes"/>

  <xsl:template match="ShotRows">
    <xsl:copy>
      <xsl:for-each-group select="ShotRow" group-adjacent="number(@col) - position()">
        <ShotRow row="{@row}" colStart="{@col}" colEnd="{@col + count(current-group()) - 1}"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
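
The grouping key works because, within a run of consecutive columns, both @col and position() increase by one per element, so their difference stays constant; a gap in @col changes the difference and starts a new group. Below is a minimal Python sketch of the same idea, not the XSLT itself. The function name `group_runs` is made up for illustration, and it assumes the column numbers are given in strictly increasing order.

```python
from itertools import groupby


def group_runs(cols):
    """Collapse strictly increasing column numbers into (start, end) ranges."""
    ranges = []
    # XSLT's position() is 1-based, so enumerate from 1 to mirror it.
    keyed = enumerate(cols, start=1)
    # groupby groups *adjacent* items with equal keys, like group-adjacent.
    for _, run in groupby(keyed, key=lambda pair: pair[1] - pair[0]):
        run_cols = [col for _, col in run]
        # Same arithmetic as colEnd="{@col + count(current-group()) - 1}".
        ranges.append((run_cols[0], run_cols[0] + len(run_cols) - 1))
    return ranges


print(group_runs([3, 4, 6, 7, 8, 10, 11, 15, 19]))
# [(3, 4), (6, 8), (10, 11), (15, 15), (19, 19)]
```

Note that the key only looks at @col, so this sketch (like the stylesheet above) would merge elements across different row values if their columns happened to be consecutive.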


With carefully crafted XPath expressions, this is a simple select-and-copy operation.

<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
>
  <xsl:template match="ShotRows">
    <xsl:copy>
      <xsl:apply-templates select="ShotRow[
        not(preceding-sibling::ShotRow) 
        or 
        not(@col = preceding-sibling::ShotRow[1]/@col + 1)
      ]" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="ShotRow">
    <xsl:copy>
      <xsl:copy-of select="@row" />
      <xsl:attribute name="colStart">
        <xsl:value-of select="@col" />
      </xsl:attribute>
      <xsl:attribute name="colEnd">
        <xsl:value-of select="(. | following-sibling::ShotRow)[
          not(@col = following-sibling::ShotRow[1]/@col - 1)
        ][1]/@col" />
      </xsl:attribute>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

This produces exactly the output you asked for. The first XPath expression is:

ShotRow[
  not(preceding-sibling::ShotRow) 
  or 
  not(@col = preceding-sibling::ShotRow[1]/@col + 1)
]

and it selects all <ShotRow> nodes that

  • either have no predecessor, i.e. are the first (or only) one in a set,
  • or whose @col is not exactly one more than their predecessor's.
  • Ergo: these conditions denote the start of a consecutive range.
  • I have marked all positions for which this is true with #s below.

The second expression is a tiny bit more delicate:

(. | following-sibling::ShotRow)[
  not(@col = following-sibling::ShotRow[1]/@col - 1)
][1]/@col

  • (. | following-sibling::ShotRow) is the union of the current node and all its following siblings. I would use "following-sibling-or-self", but unfortunately such an axis does not exist ;)
  • Of these nodes, it selects the ones whose @col is not exactly one less than their successor's.
  • Ergo: this condition denotes the end of a consecutive range (note that this selects the ends of all consecutive ranges ahead).
  • Of these nodes, it takes the first one (logically, we are interested in the first end of a consecutive range, i.e. the one closest to us).
  • I have marked all positions for which this is true with #e below.

Your example:

<ShotRows>
  <ShotRow row="3" col="3" bit="1" position="1"/><!-- #s -->
  <ShotRow row="3" col="4" bit="1" position="2"/><!-- #e -->
  <ShotRow row="3" col="6" bit="1" position="3"/><!-- #s -->
  <ShotRow row="3" col="7" bit="1" position="4"/>
  <ShotRow row="3" col="8" bit="1" position="5"/><!-- #e -->
  <ShotRow row="3" col="10" bit="1" position="6"/><!-- #s -->
  <ShotRow row="3" col="11" bit="1" position="7"/><!-- #e -->
  <ShotRow row="3" col="15" bit="1" position="8"/><!-- #s #e -->
  <ShotRow row="3" col="19" bit="1" position="9"/><!-- #s #e -->
</ShotRows>

Output:

<ShotRows>
  <ShotRow row="3" colStart="3" colEnd="4" />
  <ShotRow row="3" colStart="6" colEnd="8" />
  <ShotRow row="3" colStart="10" colEnd="11" />
  <ShotRow row="3" colStart="15" colEnd="15" />
  <ShotRow row="3" colStart="19" colEnd="19" />
</ShotRows>
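
For readers who find the XPath dense, the two conditions can be mirrored in a few lines of Python. This is only an illustrative sketch operating on a plain list of column numbers; the function name `ranges_by_markers` is hypothetical and not part of the stylesheet.

```python
def ranges_by_markers(cols):
    """Pair each range start (#s) with the first range end (#e) at or after it."""
    n = len(cols)
    # #s: no predecessor, or col is not exactly the predecessor's col + 1
    starts = [i for i in range(n) if i == 0 or cols[i] != cols[i - 1] + 1]
    # #e: no successor, or col is not exactly the successor's col - 1
    is_end = [i == n - 1 or cols[i] != cols[i + 1] - 1 for i in range(n)]
    result = []
    for s in starts:
        # Mirrors (. | following-sibling::ShotRow)[end condition][1]
        e = next(i for i in range(s, n) if is_end[i])
        result.append((cols[s], cols[e]))
    return result


print(ranges_by_markers([3, 4, 6, 7, 8, 10, 11, 15, 19]))
# [(3, 4), (6, 8), (10, 11), (15, 15), (19, 19)]
```

As in the XPath version, the search for the end starts at the current element itself, which is why a single-element range such as col 15 is both its own start and its own end.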

EDIT - A modified version of the above uses XSL keys. For large input documents, a performance gain should become noticeable, chiefly because 'kEnd' cuts down processing time. 'kStart' does not have much impact; I included it for code symmetry only.

<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
>
  <xsl:key 
    name="kStart" 
    match="ShotRow[
      not(preceding-sibling::ShotRow) 
      or 
      not(@col = preceding-sibling::ShotRow[1]/@col + 1)
    ]" 
    use="generate-id(..)" 
  />
  <!-- Index every range end under the ID of the range start it belongs to,
       i.e. the closest range start at or before it. -->
  <xsl:key 
    name="kEnd" 
    match="ShotRow[
      not(@col = following-sibling::ShotRow[1]/@col - 1)
    ]" 
    use="generate-id(
      (. | preceding-sibling::ShotRow)[
        not(preceding-sibling::ShotRow) 
        or 
        not(@col = preceding-sibling::ShotRow[1]/@col + 1)
      ][last()]
    )" 
  />

  <xsl:template match="ShotRows">
    <xsl:copy>
      <xsl:apply-templates select="key('kStart', generate-id(.))" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="ShotRow">
    <xsl:copy>
      <xsl:copy-of select="@row" />
      <xsl:attribute name="colStart">
        <xsl:value-of select="@col" />
      </xsl:attribute>
      <xsl:attribute name="colEnd">
        <!-- look up the range end that was indexed under this range start -->
        <xsl:value-of select="key('kEnd', generate-id())/@col" />
      </xsl:attribute>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The start and end conditions are exactly the same as explained above; the keys only precompute them.


This feels a little messy, as iterative processing often does in XSLT.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes" />

    <xsl:template match="ShotRows">
        <xsl:copy>
            <xsl:apply-templates select="ShotRow[1]" />
        </xsl:copy>
    </xsl:template>

    <xsl:template match="ShotRow">
        <xsl:call-template name="ShotRow">
            <xsl:with-param name="start" select="@col" />
            <xsl:with-param name="shotrow" select="." />
        </xsl:call-template>
    </xsl:template>

    <xsl:template name="ShotRow">
        <xsl:param name="start" />
        <xsl:param name="shotrow" />

        <xsl:choose>
            <xsl:when test="$shotrow/@row = $shotrow/following-sibling::ShotRow[1]/@row and 1 + number($shotrow/@col) = number($shotrow/following-sibling::ShotRow[1]/@col)">
                <xsl:call-template name="ShotRow">
                    <xsl:with-param name="start" select="$start" />
                    <xsl:with-param name="shotrow" select="$shotrow/following-sibling::ShotRow[1]" />
                </xsl:call-template>

            </xsl:when>
            <xsl:otherwise>
                <ShotRow row="{$shotrow/@row}" colStart="{$start}" colEnd="{$shotrow/@col}" />
                <xsl:apply-templates select="$shotrow/following-sibling::ShotRow[1]" />

            </xsl:otherwise>
        </xsl:choose>

    </xsl:template>

</xsl:stylesheet>
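
Unlike the stylesheets above, this recursive version also compares @row, so a change of row always starts a new range. The same single pass can be sketched in Python as follows. It is an illustration only, assuming the input is a list of (row, col) tuples in document order; the function name `merge_shot_rows` is made up.

```python
def merge_shot_rows(shots):
    """Merge (row, col) tuples in document order into (row, colStart, colEnd)."""
    merged = []
    for row, col in shots:
        if merged and merged[-1][0] == row and merged[-1][2] + 1 == col:
            # Same row and the next column: extend the open range.
            open_row, start, _ = merged[-1]
            merged[-1] = (open_row, start, col)
        else:
            # A gap or a new row: start a new range of length one.
            merged.append((row, col, col))
    return merged


print(merge_shot_rows([(3, 3), (3, 4), (4, 5), (4, 6), (4, 8)]))
# [(3, 3, 4), (4, 5, 6), (4, 8, 8)]
```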