XSLT: Splitting non-continuous elements / grouping continuous elements
I need some help implementing this in XSLT. I had already implemented it in Java using a SAX parser, but that is troublesome whenever a customer asks for a change.
So we are now doing it with XSLT, which doesn't need to be compiled and deployed to a web server. I have XML like the example below.
Example 1:
<ShotRows>
<ShotRow row="3" col="3" bit="1" position="1"/>
<ShotRow row="3" col="4" bit="1" position="2"/>
<ShotRow row="3" col="5" bit="1" position="3"/>
<ShotRow row="3" col="6" bit="1" position="4"/>
<ShotRow row="3" col="7" bit="1" position="5"/>
<ShotRow row="3" col="8" bit="1" position="6"/>
<ShotRow row="3" col="9" bit="1" position="7"/>
<ShotRow row="3" col="10" bit="1" position="8"/>
<ShotRow row="3" col="11" bit="1" position="9"/>
</ShotRows>
Output 1:
<ShotRows>
<ShotRow row="3" colStart="3" colEnd="11" />
</ShotRows>
<!-- because the col is continuous from 3 to 11 -->
Example 2:
<ShotRows>
<ShotRow row="3" col="3" bit="1" position="1"/>
<ShotRow row="3" col="4" bit="1" position="2"/>
<ShotRow row="3" col="6" bit="1" position="3"/>
<ShotRow row="3" col="7" bit="1" position="4"/>
<ShotRow row="3" col="8" bit="1" position="5"/>
<ShotRow row="3" col="10" bit="1" position="6"/>
<ShotRow row="3" col="11" bit="1" position="7"/>
<ShotRow row="3" col="15" bit="1" position="8"/>
<ShotRow row="3" col="19" bit="1" position="9"/>
</ShotRows>
Output 2:
<ShotRows>
<ShotRow row="3" colStart="3" colEnd="4" />
<ShotRow row="3" colStart="6" colEnd="8" />
<ShotRow row="3" colStart="10" colEnd="11" />
<ShotRow row="3" colStart="15" colEnd="15" />
<ShotRow row="3" colStart="19" colEnd="19" />
</ShotRows>
The basic idea is to group every run of continuous col values into one element: col 3 to 4, col 6 to 8, col 10 to 11, while col 15 and col 19 each stand alone. Thanks in advance.
With Java you could use Saxon 9 and XSLT 2.0 as follows:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output indent="yes"/>
<xsl:template match="ShotRows">
<xsl:copy>
<xsl:for-each-group select="ShotRow" group-adjacent="number(@col) - position()">
<ShotRow row="{@row}" colStart="{@col}" colEnd="{@col + count(current-group()) - 1}"/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
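The grouping key works because, within a run of consecutive columns, @col and position() both grow by one per element, so their difference stays constant; any gap changes the difference and starts a new group. Both examples in the question use a single row value. If the input could contain several rows (an assumption, not something the question shows), a nested grouping along the following lines might be a starting point. It is a sketch only, and it assumes that elements of the same row are adjacent and that col is ascending within a row:

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="ShotRows">
    <xsl:copy>
      <!-- Assumption: all ShotRow elements of one row are adjacent -->
      <xsl:for-each-group select="ShotRow" group-adjacent="@row">
        <!-- position() now counts within the current row's group,
             so the col - position() trick works per row -->
        <xsl:for-each-group select="current-group()"
                            group-adjacent="number(@col) - position()">
          <!-- context item is the first element of the run -->
          <ShotRow row="{@row}"
                   colStart="{@col}"
                   colEnd="{current-group()[last()]/@col}"/>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Reading colEnd from the last element of the group, rather than computing it from the group size, is a matter of taste; both give the same value for a consecutive run.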
With carefully crafted XPath expressions, this is a simple select-and-copy operation.
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:template match="ShotRows">
<xsl:copy>
<xsl:apply-templates select="ShotRow[
not(preceding-sibling::ShotRow)
or
not(@col = preceding-sibling::ShotRow[1]/@col + 1)
]" />
</xsl:copy>
</xsl:template>
<xsl:template match="ShotRow">
<xsl:copy>
<xsl:copy-of select="@row" />
<xsl:attribute name="colStart">
<xsl:value-of select="@col" />
</xsl:attribute>
<xsl:attribute name="colEnd">
<xsl:value-of select="(. | following-sibling::ShotRow)[
not(@col = following-sibling::ShotRow[1]/@col - 1)
][1]/@col" />
</xsl:attribute>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This produces exactly the output you ask for. The first XPath expression is:
ShotRow[
not(preceding-sibling::ShotRow)
or
not(@col = preceding-sibling::ShotRow[1]/@col + 1)
]
and it selects all <ShotRow> nodes that
- either have no predecessor, i.e. the first (or only) one in a set
- or whose @col is not exactly one more than their respective predecessor's
Ergo: these conditions denote the start of a consecutive range. I have marked all positions for which this is true with #s below.
The second expression is a tiny bit more delicate:
(. | following-sibling::ShotRow)[
not(@col = following-sibling::ShotRow[1]/@col - 1)
][1]/@col
- (. | following-sibling::ShotRow) is the union of the current node and all following siblings (I would use "following-sibling-or-self", but unfortunately such an axis does not exist ;)
- of these nodes, it selects the ones whose @col is not one less than their respective successor's; ergo: this condition denotes the end of a consecutive range (note that this selects all ends of any consecutive range ahead)
- of these nodes, it takes the first one (it is logical that we are interested in the "first end of a consecutive range", or the one closest to us)
- I have marked all positions for which this is true with #e below
Your example:
<ShotRows>
<ShotRow row="3" col="3" bit="1" position="1"/><!-- #s -->
<ShotRow row="3" col="4" bit="1" position="2"/><!-- #e -->
<ShotRow row="3" col="6" bit="1" position="3"/><!-- #s -->
<ShotRow row="3" col="7" bit="1" position="4"/>
<ShotRow row="3" col="8" bit="1" position="5"/><!-- #e -->
<ShotRow row="3" col="10" bit="1" position="6"/><!-- #s -->
<ShotRow row="3" col="11" bit="1" position="7"/><!-- #e -->
<ShotRow row="3" col="15" bit="1" position="8"/><!-- #s #e -->
<ShotRow row="3" col="19" bit="1" position="9"/><!-- #s #e -->
</ShotRows>
Output:
<ShotRows>
<ShotRow row="3" colStart="3" colEnd="4" />
<ShotRow row="3" colStart="6" colEnd="8" />
<ShotRow row="3" colStart="10" colEnd="11" />
<ShotRow row="3" colStart="15" colEnd="15" />
<ShotRow row="3" colStart="19" colEnd="19" />
</ShotRows>
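The two expressions above compare only @col. If the input could contain more than one row value (an assumption; both examples in the question use row 3 only), the same start and end tests might be extended to require a matching @row as well. A minimal sketch of that variation, not a tested replacement for the stylesheet above:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="ShotRows">
    <xsl:copy>
      <!-- range start: the predecessor is missing, belongs to another row,
           or its col is not exactly one less -->
      <xsl:apply-templates select="ShotRow[
        not(@row = preceding-sibling::ShotRow[1]/@row
            and @col = preceding-sibling::ShotRow[1]/@col + 1)
      ]" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="ShotRow">
    <!-- range end: first node at or after the start whose successor is
         missing, belongs to another row, or whose col is not one more -->
    <ShotRow row="{@row}" colStart="{@col}"
      colEnd="{(. | following-sibling::ShotRow)[
        not(@row = following-sibling::ShotRow[1]/@row
            and @col = following-sibling::ShotRow[1]/@col - 1)
      ][1]/@col}" />
  </xsl:template>
</xsl:stylesheet>
```

For the first and last elements the sibling lookup returns an empty node-set, so both comparisons are false and the element counts as a start or an end without a separate test.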
EDIT - A modified version of the above uses XSL keys. For large input documents, a performance gain should become noticeable, chiefly because 'kEnd' cuts down processing time. 'kStart' does not have much impact; I included it for code symmetry only.
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:key
name="kStart"
match="ShotRow[
not(preceding-sibling::ShotRow)
or
not(@col = preceding-sibling::ShotRow[1]/@col + 1)
]"
use="generate-id(..)"
/>
<!-- index each range end by the id of the range start it belongs to,
i.e. the closest start at or before the end node -->
<xsl:key
name="kEnd"
match="ShotRow[
not(@col = following-sibling::ShotRow[1]/@col - 1)
]"
use="generate-id(
(. | preceding-sibling::ShotRow)[
not(@col = preceding-sibling::ShotRow[1]/@col + 1)
][last()]
)"
/>
<xsl:template match="ShotRows">
<xsl:copy>
<xsl:apply-templates select="key('kStart', generate-id(.))" />
</xsl:copy>
</xsl:template>
<xsl:template match="ShotRow">
<xsl:copy>
<xsl:copy-of select="@row" />
<xsl:attribute name="colStart">
<xsl:value-of select="@col" />
</xsl:attribute>
<xsl:attribute name="colEnd">
<xsl:value-of select="key('kEnd', generate-id())/@col" />
</xsl:attribute>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
The logic is exactly the same as explained above.
This feels a little messy, as iterative processing often does in XSLT.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<xsl:template match="ShotRows">
<xsl:copy>
<xsl:apply-templates select="ShotRow[1]" />
</xsl:copy>
</xsl:template>
<xsl:template match="ShotRow">
<xsl:call-template name="ShotRow">
<xsl:with-param name="start" select="@col" />
<xsl:with-param name="shotrow" select="." />
</xsl:call-template>
</xsl:template>
<xsl:template name="ShotRow">
<xsl:param name="start" />
<xsl:param name="shotrow" />
<xsl:choose>
<xsl:when test="$shotrow/@row = $shotrow/following-sibling::ShotRow[1]/@row and 1 + number($shotrow/@col) = number($shotrow/following-sibling::ShotRow[1]/@col)">
<xsl:call-template name="ShotRow">
<xsl:with-param name="start" select="$start" />
<xsl:with-param name="shotrow" select="$shotrow/following-sibling::ShotRow[1]" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<ShotRow row="{$shotrow/@row}" colStart="{$start}" colEnd="{$shotrow/@col}" />
<xsl:apply-templates select="$shotrow/following-sibling::ShotRow[1]" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>