XSLT modify identity rule with string matching
I need to create an XSLT that follows two rules (in order of priority):
- It should copy every /xs:schema/node() whose /xs:schema/node()/@name starts with "prefix_". The copied /xs:schema/node() should include all of its descendants and attributes.
- Otherwise, it should create a /xs:schema/node() containing only the descendants that have any attribute whose value starts with "prefix_".
The document I have follows this format:
<?xml version="1.0" encoding="UTF-8"?>
<!--
this is
a really long
comment
that spans
multiple lines
-->
<!-- <!a comment > another comment -->
<!-- <!a comment > another comment -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<!-- a comment -->
<xs:node name="ABC">
<xs:node>
<xs:element/>
<xs:element attr="asdf"/>
</xs:node>
</xs:node>
<!-- <!a comment > another comment -->
<node name="DEF">
<element/>
<element attr="asdf" bttr="zxcv"/>
</node>
<!-- <!a comment > another comment -->
<node name="prefix_a">
<element/>
<element attr="asdf"/>
<element attr="prefix_attr"/>
<element battr="prefix_battr"/>
</node>
<node name="prefix_b">
<node>
<element/>
<element battr="prefix_bttr"/>
<element hattr="prefix_cattr"/>
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element attr="qwerty"/>
<element attr="zxvc"/>
<element attr="asdf"/>
<element battr="prefix_bttr"/>
<element flattr="prefix_hattr"/>
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element/>
<element attr="asdf"/>
<element shattr="prefix_shattr"/>
<element cattr="prefix_battr"/>
</node>
<!-- <!a comment > another comment -->
<node name="g">
<element attr="asdf" bttr="zxcv"/>
<element/>
</node>
</xs:schema>
The XSLT should return:
<xml>
<xs:schema>
<node name="prefix_a">
<element />
<element attr="asdf" />
<element attr="prefix_attr" />
<element battr="prefix_battr" />
</node>
<node name="prefix_b">
<node>
<element />
<element battr="prefix_bttr" />
<element hattr="prefix_cattr" />
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element battr="prefix_bttr" />
<element flattr="prefix_hattr" />
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element shattr="prefix_shattr" />
<element cattr="prefix_battr" />
</node>
</xs:schema>
</xml>
I am using the following XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsl:namespace-alias stylesheet-prefix="xs" result-prefix="xsd"/>
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates select="xsd:schema"/>
</xsl:template>
<xsl:template match="xsd:schema">
<xs:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="unqualified"
attributeFormDefault="unqualified" version="1.0">
<xsl:apply-templates select="node()[starts-with(@name, 'prefix_')]"/>
<xsl:apply-templates select="node()[descendant::node()/@*[starts-with(., 'prefix_')]]"/>
</xs:schema>
</xsl:template>
<xsl:template match="xsd:schema/node()[starts-with(@name, 'prefix_')]">
<xsl:copy-of select="current()"/>
</xsl:template>
<xsl:template match="xsd:schema/node()[descendant::node()/@*[starts-with(., 'prefix_')]]">
<xsl:copy-of select="current()"/>
</xsl:template>
</xsl:stylesheet>
I have fixed the silly error noticed by @Dimitre.
Now, the following transform:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@*|*">
<xsl:copy>
<xsl:apply-templates select="@*|*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:schema/*[
not(starts-with(@name,'prefix_'))
and
not(.//*/@*[starts-with(.,'prefix_')])]"/>
<xsl:template match="*[
not(*)
and
not(@*[starts-with(.,'prefix_')])
and
not(ancestor::*[starts-with(@name,'prefix_')])
]"/>
</xsl:stylesheet>
given this input (slightly modified to cover much more complex cases):
<xml>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- a comment -->
<node name="prefix_a">
<element />
<element attr="asdf" />
<element x="y" attr="prefix_attr" />
<element battr="prefix_battr" y="x"/>
</node>
<node name="prefix_b">
<node>
<element />
<element battr="prefix_bttr" />
<element hattr="prefix_cattr" />
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element attr="qwerty" />
<element attr="zxvc" />
<element attr="asdf" />
<element battr="prefix_bttr" x="y"/>
<element flattr="prefix_hattr" y="x"/>
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element />
<element attr="asdf" />
<element shattr="prefix_shattr" />
<element cattr="prefix_battr" />
</node>
<node name="e">
<element />
<element attr="asdf" />
</node>
</xs:schema>
</xml>
produces:
<xml>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<node name="prefix_a">
<element/>
<element attr="asdf"/>
<element x="y" attr="prefix_attr"/>
<element battr="prefix_battr" y="x"/>
</node>
<node name="prefix_b">
<node>
<element/>
<element battr="prefix_bttr"/>
<element hattr="prefix_cattr"/>
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element battr="prefix_bttr" x="y"/>
<element flattr="prefix_hattr" y="x"/>
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element shattr="prefix_shattr"/>
<element cattr="prefix_battr"/>
</node>
</xs:schema>
</xml>
For the nodes that have a @name attribute starting with prefix_, you can leave the identity transform to do its work. So you only need to override the case of elements that don't have a @name starting with prefix_.
<xsl:template match="xs:schema/node()[not(starts-with(@name, 'prefix'))]">
To copy only the descendants with any attribute that starts with "prefix_", you will also need to copy any node that may not have an attribute itself, but does itself have a descendant meeting the criteria.
<xsl:apply-templates
select="@*|
node()[descendant-or-self::*[count(@*[starts-with(., 'prefix')]) > 0]]"
mode="attr" />
Here, you use the mode attribute when you apply the templates, so that you can also override how the descendants are matched:
<xsl:template match="@*|node()" mode="attr">
Here is the full XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="xs:schema/node()[not(starts-with(@name, 'prefix'))]">
<xsl:if test="descendant-or-self::*[count(@*[starts-with(., 'prefix')]) > 0]">
<xsl:copy>
<xsl:apply-templates select="@*|node()[descendant-or-self::*[count(@*[starts-with(., 'prefix')]) > 0]]" mode="attr" />
</xsl:copy>
</xsl:if>
</xsl:template>
<xsl:template match="@*|node()" mode="attr">
<xsl:copy>
<xsl:apply-templates select="@*[starts-with(., 'prefix')]|node()[descendant-or-self::*[count(@*[starts-with(., 'prefix')]) > 0]]" mode="attr"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your input XML, the output is as follows:
<xml>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<node name="prefix_a">
<element/>
<element attr="asdf"/>
<element attr="prefix_attr"/>
<element battr="prefix_battr"/>
</node>
<node name="prefix_b">
<node>
<element/>
<element battr="prefix_bttr"/>
<element hattr="prefix_cattr"/>
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element battr="prefix_bttr"/>
<element flattr="prefix_hattr"/>
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element shattr="prefix_shattr"/>
<element cattr="prefix_battr"/>
</node>
</xs:schema>
</xml>
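A side note on the predicates used above: the `count(...) > 0` tests are existence checks, and in XPath 1.0 a node-set converted to a boolean is true exactly when it is non-empty, so the same condition can be written without `count()`. The following is only a minimal sketch to make that visible, not part of the answer's stylesheet; it assumes the `prefix_` spelling from the question and simply prints, for each element, whether it or one of its descendant elements has an attribute value starting with that prefix.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- For each element, print its name and whether it, or any descendant
       element, has an attribute whose value starts with 'prefix_'. -->
  <xsl:template match="*">
    <xsl:value-of select="name()"/>
    <xsl:text>: </xsl:text>
    <!-- boolean() of a node-set is true when the node-set is non-empty,
         so no count(...) > 0 comparison is needed -->
    <xsl:value-of
        select="boolean(descendant-or-self::*[@*[starts-with(., 'prefix_')]])"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Recurse into child elements only; text and comments are skipped -->
    <xsl:apply-templates select="*"/>
  </xsl:template>
</xsl:stylesheet>
```

Inside a predicate or an `xsl:if` test the explicit `boolean()` call is not needed either, because the conversion is applied automatically there.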
Neither of the other two answers works correctly if the attribute whose value starts with "prefix_" isn't the first (and only) attribute of its parent element!
This transformation is correct and also seems to be the shortest and simplest (no modes and no explicit conditional instructions):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match=
"node[@name[not(starts-with(.,'prefix_'))]
and
not(descendant::*[@*[starts-with(.,'prefix_')]])
]"/>
<xsl:template match=
"*[ancestor::node[@name[not(starts-with(.,'prefix_'))]]
and
not(*)
and
not(@*[starts-with(.,'prefix_')])
]"/>
</xsl:stylesheet>
when applied to the following XML document (the provided one, but extended so that some elements have two attributes -- look at <node name="c">):
<xml>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- a comment -->
<node name="prefix_a">
<element />
<element attr="asdf" />
<element attr="prefix_attr" />
<element battr="prefix_battr" />
</node>
<node name="prefix_b">
<node>
<element />
<element battr="prefix_bttr" />
<element hattr="prefix_cattr" />
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element attr="qwerty" />
<element attr="zxvc" />
<element attr="asdf" />
<element y="z" battr="prefix_bttr" />
<element x="y" flattr="prefix_hattr" />
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element />
<element attr="asdf" />
<element shattr="prefix_shattr" />
<element cattr="prefix_battr" />
</node>
<node name="e">
<element />
<element attr="asdf" />
</node>
</xs:schema>
</xml>
produces the wanted, correct result:
<xml>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"><!-- a comment --><node name="prefix_a">
<element/>
<element attr="asdf"/>
<element attr="prefix_attr"/>
<element battr="prefix_battr"/>
</node>
<node name="prefix_b">
<node>
<element/>
<element battr="prefix_bttr"/>
<element hattr="prefix_cattr"/>
</node>
</node>
<node name="c">
<node>
<node>
<node>
<node>
<element y="z" battr="prefix_bttr"/>
<element x="y" flattr="prefix_hattr"/>
</node>
</node>
</node>
</node>
</node>
<node name="d">
<element shattr="prefix_shattr"/>
<element cattr="prefix_battr"/>
</node>
</xs:schema>
</xml>
Do note that the solutions in the other two answers either lose the element <node name="c"> and its complete subtree, or lose some of the attributes.
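To illustrate the attribute-loss point, here is a minimal sketch (not taken from any of the answers) of the difference between copying all attributes of a kept element and selecting only the matching ones. It assumes a hypothetical test fragment such as <node name="c"><element attr="asdf"/><element battr="prefix_bttr" x="y"/></node> and handles elements and attributes only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Copy an element together with ALL of its attributes, then recurse
       into its child elements. Selecting @*[starts-with(., 'prefix_')]
       here instead of @* would drop sibling attributes such as x="y". -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop childless elements that have no attribute whose value
       starts with 'prefix_'. -->
  <xsl:template match="*[not(*) and not(@*[starts-with(., 'prefix_')])]"/>
</xsl:stylesheet>
```

With the fragment above, the first element is dropped and the second is kept with both battr and x. The decision to keep an element is made by the predicate on the element, while the attributes are copied unconditionally, so extra attributes cannot be lost.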