
XSLT: Splitting non-continuous elements / grouping continuous elements

I need some help implementing this in XSLT. I have already implemented it in Java using a SAX parser, but that is troublesome whenever the customer requests a change.

So we are now doing it with XSLT, which doesn't need to be compiled and deployed to a web server. I have XML like the examples below.

Example 1:

  <ShotRow row="3" col="3" bit="1" position="1"/>
  <ShotRow row="3" col="4" bit="1" position="2"/>
  <ShotRow row="3" col="5" bit="1" position="3"/>
  <ShotRow row="3" col="6" bit="1" position="4"/>
  <ShotRow row="3" col="7" bit="1" position="5"/>
  <ShotRow row="3" col="8" bit="1" position="6"/>
  <ShotRow row="3" col="9" bit="1" position="7"/>
  <ShotRow row="3" col="10" bit="1" position="8"/>
  <ShotRow row="3" col="11" bit="1" position="9"/>

Output 1:

  <ShotRow row="3" colStart="3" colEnd="11" />
<!-- because the col is continuous from 3 to 11 -->

Example 2:

  <ShotRow row="3" col="3" bit="1" position="1"/>
  <ShotRow row="3" col="4" bit="1" position="2"/>
  <ShotRow row="3" col="6" bit="1" position="3"/>
  <ShotRow row="3" col="7" bit="1" position="4"/>
  <ShotRow row="3" col="8" bit="1" position="5"/>
  <ShotRow row="3" col="10" bit="1" position="6"/>
  <ShotRow row="3" col="11" bit="1" position="7"/>
  <ShotRow row="3" col="15" bit="1" position="8"/>
  <ShotRow row="3" col="19" bit="1" position="9"/>

Output 2:

  <ShotRow row="3" colStart="3" colEnd="4" />
  <ShotRow row="3" colStart="6" colEnd="8" />
  <ShotRow row="3" colStart="10" colEnd="11" />
  <ShotRow row="3" colStart="15" colEnd="15" />
  <ShotRow row="3" colStart="19" colEnd="19" />

The basic idea is to group each run of continuous col values into one element: col 3 to 4, col 6 to 8 and col 10 to 11 become one range each, while col 15 and col 19 each stand alone. Thanks in advance.

With Java you could use Saxon 9 and XSLT 2.0 as follows:


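  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>
    <xsl:template match="ShotRows">
      <xsl:copy>
        <xsl:for-each-group select="ShotRow" group-adjacent="number(@col) - position()">
          <ShotRow row="{@row}" colStart="{@col}" colEnd="{@col + count(current-group()) - 1}"/>
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:template>
  </xsl:stylesheet>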
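
The grouping key works because, within an unbroken run, @col and position() both grow by one per element, so their difference stays constant; any gap changes it. The question only shows row="3", so the following is a sketch under the assumption that ShotRows may hold several row values. It puts @row into the key as well (the '|' separator is an arbitrary choice) so that a run cannot continue across a row change:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>

  <xsl:template match="ShotRows">
    <xsl:copy>
      <!-- the key changes when @row changes or when the col run is interrupted -->
      <xsl:for-each-group select="ShotRow"
          group-adjacent="concat(@row, '|', number(@col) - position())">
        <!-- the context node is the first element of the group -->
        <ShotRow row="{@row}" colStart="{@col}"
                 colEnd="{current-group()[last()]/@col}"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With Example 2 as input this should give the same five ranges as in Output 2. Here colEnd is read from the last element of the group instead of being computed.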


With carefully crafted XPath expressions, this is a simple select-and-copy operation.

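  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="ShotRows">
      <xsl:copy>
        <!-- select only the first node of each consecutive range -->
        <xsl:apply-templates select="ShotRow[
          not(@col = preceding-sibling::ShotRow[1]/@col + 1)
        ]" />
      </xsl:copy>
    </xsl:template>

    <xsl:template match="ShotRow">
      <xsl:copy>
        <xsl:copy-of select="@row" />
        <xsl:attribute name="colStart">
          <xsl:value-of select="@col" />
        </xsl:attribute>
        <!-- the closest range end at or after the current node -->
        <xsl:attribute name="colEnd">
          <xsl:value-of select="(. | following-sibling::ShotRow)[
            not(@col = following-sibling::ShotRow[1]/@col - 1)
          ][1]/@col" />
        </xsl:attribute>
      </xsl:copy>
    </xsl:template>
  </xsl:stylesheet>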


This produces exactly the output you asked for. The first XPath expression is:

  not(@col = preceding-sibling::ShotRow[1]/@col + 1)

and it selects all <ShotRow> nodes that

  • either have no predecessor, i.e. are the first (or only) one in a set
  • or whose @col is not exactly one more than their predecessor's

Either condition marks the start of a consecutive range. I have marked all positions for which this is true with #s below.

The second expression is a tiny bit more delicate:

  (. | following-sibling::ShotRow)[
    not(@col = following-sibling::ShotRow[1]/@col - 1)
  ][1]

  • (. | following-sibling::ShotRow) is the union of the current node and all its following siblings. I would use "following-sibling-or-self", but unfortunately such an axis does not exist ;)
  • of these nodes, it selects the ones whose @col is not exactly one less than their successor's
  • ergo: this condition denotes the end of a consecutive range (note that it selects the ends of all consecutive ranges ahead)
  • of these nodes, it takes the first one, since we are interested in the closest end of a range, i.e. the end of the range we are currently in
  • I have marked all positions for which this is true with #e below

Your example:

  <ShotRow row="3" col="3" bit="1" position="1"/><!-- #s -->
  <ShotRow row="3" col="4" bit="1" position="2"/><!-- #e -->
  <ShotRow row="3" col="6" bit="1" position="3"/><!-- #s -->
  <ShotRow row="3" col="7" bit="1" position="4"/>
  <ShotRow row="3" col="8" bit="1" position="5"/><!-- #e -->
  <ShotRow row="3" col="10" bit="1" position="6"/><!-- #s -->
  <ShotRow row="3" col="11" bit="1" position="7"/><!-- #e -->
  <ShotRow row="3" col="15" bit="1" position="8"/><!-- #s #e -->
  <ShotRow row="3" col="19" bit="1" position="9"/><!-- #s #e -->

Output:


  <ShotRow row="3" colStart="3" colEnd="4" />
  <ShotRow row="3" colStart="6" colEnd="8" />
  <ShotRow row="3" colStart="10" colEnd="11" />
  <ShotRow row="3" colStart="15" colEnd="15" />
  <ShotRow row="3" colStart="19" colEnd="19" />
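
To check the #s and #e marks against your own data, the two predicates can also be used as plain tests. Here is a throwaway sketch (the text output format is made up for illustration) that prints one line per ShotRow and appends the markers that apply:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:template match="ShotRows">
    <xsl:for-each select="ShotRow">
      <xsl:value-of select="concat('col ', @col, ':')" />
      <!-- start of a range: no predecessor, or predecessor's @col is not @col - 1 -->
      <xsl:if test="not(@col = preceding-sibling::ShotRow[1]/@col + 1)"> #s</xsl:if>
      <!-- end of a range: no successor, or successor's @col is not @col + 1 -->
      <xsl:if test="not(@col = following-sibling::ShotRow[1]/@col - 1)"> #e</xsl:if>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the example above it should mark the same positions as the comments do, e.g. both #s and #e for col 15 and col 19.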

EDIT - A modified version of the above uses XSL keys. For large input documents, a performance gain should become noticeable, chiefly because 'kEnd' cuts down processing time. 'kStart' does not have much impact; I included it for code symmetry only.

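  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kStart" match="ShotRow[
        not(@col = preceding-sibling::ShotRow[1]/@col + 1)
      ]"
      use="generate-id(..)" />
    <!-- range ends, keyed by parent and by the start node of their range -->
    <xsl:key name="kEnd" match="ShotRow[
        not(@col = following-sibling::ShotRow[1]/@col - 1)
      ]"
      use="concat(generate-id(..), ':', generate-id(
        (. | preceding-sibling::ShotRow)[
          not(@col = preceding-sibling::ShotRow[1]/@col + 1)
        ][last()]))" />
    <xsl:template match="ShotRows">
      <xsl:copy>
        <xsl:apply-templates select="key('kStart', generate-id(.))" />
      </xsl:copy>
    </xsl:template>
    <xsl:template match="ShotRow">
      <xsl:copy>
        <xsl:copy-of select="@row" />
        <xsl:attribute name="colStart">
          <xsl:value-of select="@col" />
        </xsl:attribute>
        <xsl:attribute name="colEnd">
          <xsl:value-of select="key('kEnd',
            concat(generate-id(..), ':', generate-id())
          )[1]/@col" />
        </xsl:attribute>
      </xsl:copy>
    </xsl:template>
  </xsl:stylesheet>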


The logic is exactly the same as explained above.

This feels a little messy, as iterative processing often does in XSLT.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes" />

    <xsl:template match="ShotRows">
        <xsl:copy>
            <!-- start the walk at the first ShotRow only -->
            <xsl:apply-templates select="ShotRow[1]" />
        </xsl:copy>
    </xsl:template>

    <!-- each ShotRow reached here is the start of a new range -->
    <xsl:template match="ShotRow">
        <xsl:call-template name="ShotRow">
            <xsl:with-param name="start" select="@col" />
            <xsl:with-param name="shotrow" select="." />
        </xsl:call-template>
    </xsl:template>

    <xsl:template name="ShotRow">
        <xsl:param name="start" />
        <xsl:param name="shotrow" />

        <xsl:choose>
            <!-- next sibling continues the range: same row, col one higher -->
            <xsl:when test="$shotrow/@row = $shotrow/following-sibling::ShotRow[1]/@row and 1 + number($shotrow/@col) = number($shotrow/following-sibling::ShotRow[1]/@col)">
                <xsl:call-template name="ShotRow">
                    <xsl:with-param name="start" select="$start" />
                    <xsl:with-param name="shotrow" select="$shotrow/following-sibling::ShotRow[1]" />
                </xsl:call-template>
            </xsl:when>
            <!-- range ends here: emit it, then start over at the next sibling -->
            <xsl:otherwise>
                <ShotRow row="{$shotrow/@row}" colStart="{$start}" colEnd="{$shotrow/@col}" />
                <xsl:apply-templates select="$shotrow/following-sibling::ShotRow[1]" />
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>







