Alphanumeric sort on mixed string value
Given this XML snippet:
<forms>
<FORM lob="BO" form_name="AI OM 10"/>
<FORM lob="BO" form_name="CL BP 03 01"/>
<FORM lob="BO" form_name="AI OM 107"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="123 DDE"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="AI OM 98"/>
</forms>
I need to sort the FORM nodes alphabetically by form_name, so that all forms whose form_name contains 'AI OM' are grouped together, and within each group the forms are in numeric order by their integers (and likewise for the other forms).
The form_name format is wide open; letters and numbers can appear in any order:
XX ## ##
XX XX ##
XX XX ###
XX XX ## ##
XX ###
XX XXXX
'## XXX
XXX###
What I THINK needs to happen is that the string needs to be split into its alpha and numeric parts. The numeric part could probably be sorted with any spaces removed, I suppose.
I am at a loss as to how to split the string and then cover all the sorting/grouping combinations given that there are no rules around the 'form_name' format.
We are using XSLT 2.0. Thanks.
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vDigits" select="'0123456789 '"/>
<xsl:variable name="vAlpha" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ '"/>
<xsl:template match="/*">
<forms>
<xsl:for-each select="FORM">
<xsl:sort select="translate(@form_name,$vDigits,'')"/>
<xsl:sort select="translate(@form_name,$vAlpha,'')"
data-type="number"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</forms>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<forms>
<FORM lob="BO" form_name="AI OM 10"/>
<FORM lob="BO" form_name="CL BP 03 01"/>
<FORM lob="BO" form_name="AI OM 107"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="123 DDE"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="AI OM 98"/>
</forms>
produces the wanted, correct result:
<forms>
<FORM lob="BO" form_name="AI OM 10"/>
<FORM lob="BO" form_name="AI OM 98"/>
<FORM lob="BO" form_name="AI OM 107"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="CL BP 03 01"/>
<FORM lob="BO" form_name="123 DDE"/>
</forms>
Do note:
Two <xsl:sort> instructions implement the two-phase sorting.
The XPath translate() function is used to produce either the alpha-only sort key or the digits-only sort key.
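To trace the two sort keys outside an XSLT processor, here is a minimal Python sketch of the same idea. It is only an illustration, not part of the stylesheet: the names two_phase_key, DIGITS and ALPHA are made up for this example, Python's code-point string comparison stands in for the processor's default collation, and -inf is an assumed stand-in for a name that has no digits left.

```python
# Characters removed for each key, mirroring $vDigits and $vAlpha in the stylesheet.
DIGITS = "0123456789 "
ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "


def two_phase_key(form_name):
    """Return (alpha-only key, digits-only key as a number)."""
    alpha_key = form_name.translate(str.maketrans("", "", DIGITS))
    digit_text = form_name.translate(str.maketrans("", "", ALPHA))
    # Assumption: a name whose digits-only key is not a number sorts first in its group.
    number_key = float(digit_text) if digit_text.isdecimal() else float("-inf")
    return (alpha_key, number_key)


forms = [
    "AI OM 10", "CL BP 03 01", "AI OM 107", "CL BP 00 02",
    "123 DDE", "CL BP 00 02", "AI OM 98",
]

# sorted() is stable, so equal keys keep document order, as with xsl:sort.
ordered = sorted(forms, key=two_phase_key)
for name in ordered:
    print(name, two_phase_key(name))
```

Note that "CL BP 03 01" gets the numeric key 301 because the space between the two numbers is stripped as well, and that only uppercase letters are removed, exactly as in $vAlpha.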
This stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="forms">
<xsl:apply-templates>
<xsl:sort select="normalize-space(
translate(@form_name,
'0123456789',
''))"/>
<xsl:sort select="substring-before(
concat(
normalize-space(
translate(@form_name,
translate(@form_name,
'0123456789 ',
''),
'')),
' '),' ')" data-type="number"/>
<xsl:sort select="substring-after(
normalize-space(
translate(@form_name,
translate(@form_name,
'0123456789 ',
''),
'')),
' ')" data-type="number"/>
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
Output:
<FORM lob="BO" form_name="AI OM 10"></FORM>
<FORM lob="BO" form_name="AI OM 98"></FORM>
<FORM lob="BO" form_name="AI OM 107"></FORM>
<FORM lob="BO" form_name="CL BP 00 02"></FORM>
<FORM lob="BO" form_name="CL BP 00 02"></FORM>
<FORM lob="BO" form_name="CL BP 03 01"></FORM>
<FORM lob="BO" form_name="123 DDE"></FORM>
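The nested translate() calls in the second and third sort keys can be hard to read: the inner call computes the characters to throw away, and the outer call deletes them, so only digits and spaces survive. A rough Python sketch of that idiom follows; the helper names delete_chars, keep_only and numeric_parts are hypothetical and exist only for this illustration.

```python
def delete_chars(text, chars):
    """Like XPath translate(text, chars, ''): drop every character listed in chars."""
    return text.translate(str.maketrans("", "", chars))


def keep_only(text, allowed):
    """Like translate(text, translate(text, allowed, ''), ''): keep only allowed characters."""
    unwanted = delete_chars(text, allowed)
    return delete_chars(text, unwanted)


def numeric_parts(form_name):
    """Return the first number and the remainder as strings (sort keys two and three)."""
    # " ".join(s.split()) plays the role of normalize-space().
    digits_and_spaces = " ".join(keep_only(form_name, "0123456789 ").split())
    # partition mirrors the substring-before / substring-after pair on the first space.
    first, _, rest = digits_and_spaces.partition(" ")
    return first, rest


for name in ["AI OM 107", "CL BP 03 01", "123 DDE"]:
    print(name, numeric_parts(name))
```

When a name has only one number, the remainder is an empty string, which the stylesheet then converts to NaN for the third key.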
XSLT 2.0 solution, using this stylesheet:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="forms">
<xsl:apply-templates>
<xsl:sort select="string-join(tokenize(@form_name,' ')
[not(. castable as xs:integer)],
' ')"/>
<xsl:sort select="xs:integer(tokenize(@form_name,' ')
[. castable as xs:integer][1])"/>
<xsl:sort select="xs:integer(tokenize(@form_name,' ')
[. castable as xs:integer][2])"/>
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
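The three keys built by the tokenize() approach can be traced with a small Python sketch. This is an approximation for illustration only: the name token_keys and the INTEGER pattern are assumptions of this example, the pattern is a rough stand-in for "castable as xs:integer", and -inf models the rule that an empty sort key orders before any other value.

```python
import re

# Rough stand-in for ". castable as xs:integer" (assumption: optional sign, ASCII digits).
INTEGER = re.compile(r"[+-]?[0-9]+")


def token_keys(form_name):
    """Return (joined non-integer tokens, first integer, second integer)."""
    tokens = form_name.split(" ")
    words = " ".join(t for t in tokens if not INTEGER.fullmatch(t))
    numbers = [int(t) for t in tokens if INTEGER.fullmatch(t)]
    # A missing first or second integer is modelled as -inf so it sorts first.
    padded = (numbers + [float("-inf"), float("-inf")])[:2]
    return (words, padded[0], padded[1])


forms = [
    "AI OM 10", "CL BP 03 01", "AI OM 107", "CL BP 00 02",
    "123 DDE", "CL BP 00 02", "AI OM 98",
]
ordered = sorted(forms, key=token_keys)
for name in ordered:
    print(name, token_keys(name))
```

Unlike the translate()-based keys, the two numbers in "CL BP 03 01" stay separate here (3, then 1) instead of being fused into 301.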
Note that the marked answer does not work in all cases. For example:
Input:
<forms>
<FORM lob="BO" form_name="AA 11 AB"/>
<FORM lob="BO" form_name="AA AZ 01"/>
</forms>
Expected Output:
<forms>
<FORM lob="BO" form_name="AA AZ 01"/>
<FORM lob="BO" form_name="AA 11 AB"/>
</forms>
Actual Output:
<forms>
<FORM lob="BO" form_name="AA 11 AB"/>
<FORM lob="BO" form_name="AA AZ 01"/>
</forms>
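If letters are allowed after numbers, you cannot simply strip the digits out when building the first sort key: doing so merges letters from different positions into one string ("AA 11 AB" becomes "AAAB"), so the original position of the number is lost.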
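One way to keep positions intact is to compare the name chunk by chunk instead of stripping characters. The Python sketch below illustrates the idea; it is not a drop-in XSLT solution. The name chunk_key is made up for this example, and the rule that a letter chunk sorts before a number chunk is an assumption inferred from the expected output above and from "123 DDE" sorting last in the question.

```python
import re


def chunk_key(form_name):
    """Split into digit runs and non-digit runs, then compare position by position."""
    chunks = re.findall(r"[0-9]+|[^0-9\s]+", form_name)
    # Assumption: a letter chunk (rank 0) sorts before a number chunk (rank 1).
    # Each tuple has the same shape, so strings are never compared with integers.
    return [(1, int(c), "") if c.isdecimal() else (0, 0, c) for c in chunks]


problem_case = sorted(["AA 11 AB", "AA AZ 01"], key=chunk_key)
print(problem_case)

original_forms = [
    "AI OM 10", "CL BP 03 01", "AI OM 107", "CL BP 00 02",
    "123 DDE", "CL BP 00 02", "AI OM 98",
]
print(sorted(original_forms, key=chunk_key))
```

Because the pattern splits on the boundary between digits and non-digits rather than on spaces, a name such as "XXX123" is also broken into a letter chunk and a number chunk. In XSLT 2.0 a comparable split could probably be built with xsl:analyze-string, though that is not shown here.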