Sort XML using XSLT
I have an xml that describes a catalog tree. It can have any number of child nodes. Here is an example:
<Catalog name="AccessoriesCatalog">
<Category Definition="AccessoriesCategory" name="1532" id="1532">
</Category>
<Category Definition="AccessoriesCategory" name="16115" id="16115">
<ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16116" id="16116">
<ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16126" id="16126">
<ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16131" id="16131">
<ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16132" id="16132">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16136" id="16136">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16139" id="16139">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16144" id="16144">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16195" id="16195">
<ParentCategory>16131</ParentCategory>
</Category>
</Catalog>
I need to be able to sort it on Category name and ParentCategory. All parent categories shall come first in the xml and the "leaf categories" shall come last. In this sample the xml is already sorted.
The XML above looks like this when it is represented as a tree:
1532
-16115
--16116
--16126
-16131
--16132
--16136
--16139
--16144
--16195
I want it to be sorted like this:
1532
-16115
-16131
--16116
--16126
--16132
--16136
--16139
--16144
--16195
There can be several levels of child elements (in this case it is only a 3-level tree). I want all level 1 elements to come first in the XML, then all level 2 elements, then all level 3 elements, and so on.
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="kElemById" match="Category"
use="@id"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:call-template name="sortHier">
<xsl:with-param name="pNodes" select=
"*[ParentCategory]"/>
<xsl:with-param name="pParents" select=
"*[not(ParentCategory)]"/>
</xsl:call-template>
</xsl:copy>
</xsl:template>
<xsl:template name="sortHier">
<xsl:param name="pNodes"/>
<xsl:param name="pParents"/>
<xsl:apply-templates select=
"$pParents|$pNodes[not($pParents)]">
<xsl:sort select="@name"/>
</xsl:apply-templates>
<xsl:if test="$pNodes and $pParents">
<!-- New parents: the remaining nodes whose ParentCategory is one of the current parents -->
<xsl:variable name="vNewParents"
select="$pNodes[ParentCategory = $pParents/@id]"/>
<xsl:variable name="vNewChildren"
select="$pNodes[not(@id=$vNewParents/@id)]"/>
<xsl:call-template name="sortHier">
<xsl:with-param name="pNodes"
select="$vNewChildren"/>
<xsl:with-param name="pParents"
select="$vNewParents"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document (based on the provided one, but shuffled/unsorted):
<Catalog name="AccessoriesCatalog">
<Category Definition="AccessoriesCategory"
name="16144" id="16144">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16116" id="16116">
<ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16126" id="16126">
<ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16131" id="16131">
<ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16132" id="16132">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16136" id="16136">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16139" id="16139">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="16115" id="16115">
<ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory"
name="1532" id="1532"></Category>
<Category Definition="AccessoriesCategory"
name="16195" id="16195">
<ParentCategory>16131</ParentCategory>
</Category>
</Catalog>
produces the wanted, correct result:
<Catalog name="AccessoriesCatalog">
<Category Definition="AccessoriesCategory" name="1532" id="1532"/>
<Category Definition="AccessoriesCategory" name="16115" id="16115">
<ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16131" id="16131">
<ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16116" id="16116">
<ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16126" id="16126">
<ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16132" id="16132">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16136" id="16136">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16139" id="16139">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16144" id="16144">
<ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16195" id="16195">
<ParentCategory>16131</ParentCategory>
</Category>
</Catalog>
Explanation:
The solution is a recursively called named template with two parameters: the set of "current parents" (the parents found last) and the set of "current nodes" (the nodes not yet processed).
Stop condition: either the set of current parents, or the set of current nodes, or both are empty. Here we output (sorted by @name) whichever parameter set is still non-empty.
Recursive step: the immediate children of the current parents become the new current parents. The rest of the current nodes become the new current nodes. Each call copies all the current parents, or all the current nodes if there are no current parents left.
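As a language-neutral illustration of the explanation above, here is a minimal Python sketch of the same level-by-level idea. It is not the stylesheet itself: the function name `sort_by_level` and the `(id, parent_id)` tuple representation are assumptions made only for this example, and ids are compared as strings, loosely mirroring a text sort on @name.

```python
def sort_by_level(categories):
    """Return category ids level by level, each level sorted as text.

    `categories` is a list of (id, parent_id) string pairs; parent_id is
    None for a top-level category. Hypothetical representation, chosen
    only for this sketch.
    """
    # "Current parents": start with the categories that have no parent.
    parents = [c for c in categories if c[1] is None]
    # "Current nodes": everything not yet processed.
    nodes = [c for c in categories if c[1] is not None]
    ordered = []

    while parents and nodes:
        # Output the current parents, sorted.
        ordered.extend(sorted(cid for cid, _ in parents))
        parent_ids = {cid for cid, _ in parents}
        # Immediate children of the current parents become the new parents.
        new_parents = [c for c in nodes if c[1] in parent_ids]
        # The rest stay in the pool of unprocessed nodes.
        nodes = [c for c in nodes if c[1] not in parent_ids]
        parents = new_parents

    # Stop condition: at least one set is empty; output whichever remains.
    ordered.extend(sorted(cid for cid, _ in (parents or nodes)))
    return ordered


# The shuffled sample from this answer, reduced to (id, parent) pairs.
sample = [
    ("16144", "16131"), ("16116", "16115"), ("16126", "16115"),
    ("16131", "1532"), ("16132", "16131"), ("16136", "16131"),
    ("16139", "16131"), ("16115", "1532"), ("1532", None),
    ("16195", "16131"),
]
print(sort_by_level(sample))
```

Each pass removes one whole level from the pool of unprocessed nodes, so the loop runs at most once per level of the tree. Nodes whose parent never appears are emitted at the end rather than dropped.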
Update:
In comments, the OP reported that the solution works on small files,
"But when I try it on the whole xml with more elements and more levels it is not working. The xml I have is about 8Mb so I can't post it here."
I asked him to provide the XML files offline, and when I got them I confirmed that this solution performs without problems on both the small and the bigger (44000 lines, 700KB) files I was given.
The performance on the bigger file wasn't too bad with the exception of MSXML3.
Here is the performance data for the 44000-line file, as measured on my 8-year-old PC (2GB RAM, 3GHz single core):
MSXML3: 91 sec.
MSXML6: 6 sec.
AltovaXML (XMLSpy): 6 sec.
Saxon 6.5.4: 2 sec.
Saxon 9.1.05: 1.6 sec.
XslCompiledTransform: 1.3 sec.
XQSharp: 0.8 sec.
I think your answer could be found here: http://www.programmersheaven.com/2/FAQ-XML-Sort-XML-By-Multiple-Attributes-XSLT