XSL to find all nodes between nodes
I have a large, poorly structured XML file in which the information for a single line item is split across several LINE_INFO elements, and I'm trying to group those elements with their parent line item (ITEM_ID). The information is sequential, so the ITEM_ID node is the key, but I can't work out the XSL needed to group everything related to an item (ITEM_ID), given the following XML source (updated to include a newly discovered grandchild element in the XML source):
<LINE_INFO>
<ITEM_ID>some_part_num</ITEM_ID>
<DESC>some_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
<EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>some_other_part_num</ITEM_ID>
<DESC>some_other_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
<EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
<ADDTL_NOTE_DETAIL>
<NOTE>This is the grandchild note that sometimes appears in my data</NOTE>
</ADDTL_NOTE_DETAIL>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>yet_another_part_num</ITEM_ID>
<DESC>yet_another_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
...
Desired output:
<LINE_INFO>
<ITEM_ID>some_part_num</ITEM_ID>
<DESC>some_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
<EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>some_other_part_num</ITEM_ID>
<DESC>some_other_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
<EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
<LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
<NOTE>This is the grandchild note that sometimes appears in my data</NOTE>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>yet_another_part_num</ITEM_ID>
<DESC>yet_another_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kFollowing" match="LINE_INFO[not(ITEM_ID)]"
use="generate-id(preceding-sibling::LINE_INFO[ITEM_ID][1])"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="LINE_INFO[ITEM_ID]">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
<xsl:apply-templates select="key('kFollowing', generate-id())/node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="LINE_INFO[not(ITEM_ID)]"/>
</xsl:stylesheet>
when applied to the provided XML document (wrapped in a single top element to make it well-formed):
<t>
<LINE_INFO>
<ITEM_ID>some_part_num</ITEM_ID>
<DESC>some_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
<EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>some_other_part_num</ITEM_ID>
<DESC>some_other_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
<EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>yet_another_part_num</ITEM_ID>
<DESC>yet_another_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
</t>
produces the wanted, correct result:
<t>
<LINE_INFO>
<ITEM_ID>some_part_num</ITEM_ID>
<DESC>some_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
<EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>some_other_part_num</ITEM_ID>
<DESC>some_other_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
<EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
<LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>yet_another_part_num</ITEM_ID>
<DESC>yet_another_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
</t>
Do note: the use of keys to identify, easily and efficiently, all LINE_INFO nodes that don't have an ITEM_ID child and immediately follow a LINE_INFO node with an ITEM_ID child.
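The sample input used above leaves out the ADDTL_NOTE_DETAIL grandchild from the question, so with the stylesheet as written that wrapper would be copied along with its NOTE. A minimal sketch of one way to get the flattened NOTE shown in the desired output is to add a template that unwraps ADDTL_NOTE_DETAIL. The extra template and the xsl:strip-space instruction are additions for illustration, not part of the transformation above, and the sketch has not been run against the real data:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <!-- Optional addition: drop whitespace-only text nodes so re-indenting is clean -->
  <xsl:strip-space elements="*"/>

  <!-- Index every LINE_INFO without an ITEM_ID by the id of the nearest
       preceding LINE_INFO that has one -->
  <xsl:key name="kFollowing" match="LINE_INFO[not(ITEM_ID)]"
      use="generate-id(preceding-sibling::LINE_INFO[ITEM_ID][1])"/>

  <!-- Identity rule: copy everything that no other template handles -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- A group starter: emit its own children, then the children of its followers -->
  <xsl:template match="LINE_INFO[ITEM_ID]">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
      <xsl:apply-templates select="key('kFollowing', generate-id())/node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Followers are output only through their group starter -->
  <xsl:template match="LINE_INFO[not(ITEM_ID)]"/>

  <!-- Assumption: the wrapper itself is not wanted, only its children (NOTE) -->
  <xsl:template match="ADDTL_NOTE_DETAIL">
    <xsl:apply-templates select="node()"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the unwrapping template only applies templates to the wrapper's children, the identity rule still copies NOTE itself, which should place it directly inside the merged LINE_INFO.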
This XSLT 2.0 stylesheet:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="root">
<xsl:for-each-group select="LINE_INFO"
group-starting-with="LINE_INFO[ITEM_ID]">
<xsl:copy>
<xsl:apply-templates select="current-group()/node()"/>
</xsl:copy>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
With this input:
<root>
<LINE_INFO>
<ITEM_ID>some_part_num</ITEM_ID>
<DESC>some_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
<EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>some_other_part_num</ITEM_ID>
<DESC>some_other_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
<EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>yet_another_part_num</ITEM_ID>
<DESC>yet_another_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
</root>
Output:
<LINE_INFO>
<ITEM_ID>some_part_num</ITEM_ID>
<DESC>some_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
<EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>some_other_part_num</ITEM_ID>
<DESC>some_other_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
<EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
<LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
<ITEM_ID>yet_another_part_num</ITEM_ID>
<DESC>yet_another_part_num_description</DESC>
<QTY>nn</QTY>
<UNIT>uom</UNIT>
</LINE_INFO>
This is a classic grouping problem. The best approach depends on whether you have XSLT 2.0, or have to use 1.0.
If 2.0, you'll want to use <xsl:for-each-group>:
<table>
<xsl:for-each-group select="LINE_INFO" group-starting-with="LINE_INFO[ITEM_ID]">
The above XPath expressions for select and group-starting-with assume that the context node is the parent of the LINE_INFO elements. Alternatively, you could put // on the front of both expressions, at the risk of lesser performance.
Output a row for each group, with the data put in table cells according to your most recent comment:
<tr>
<td><xsl:value-of select="current-group()/ITEM_ID" /></td>
<td>
<xsl:value-of select="concat(current-group()/DESC, current-group()/EXT_DESC)"/>
<br />
<xsl:value-of select="current-group()/LINE_NOTE" />
<br />
<xsl:value-of select="current-group()/NOTE" />
</td>
<td><xsl:value-of select="current-group()/QTY" /></td>
<td><xsl:value-of select="current-group()/ADDTL_NOTE_DETAIL/NOTE" /></td>
</tr>
</xsl:for-each-group>
</table>
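Put together, the fragments above might look like the following self-contained XSLT 2.0 stylesheet. This is a sketch under assumptions rather than a tested solution: it assumes the LINE_INFO elements are children of a document element named root (a hypothetical wrapper name), it joins DESC and EXT_DESC with the separator attribute instead of concat(), and it reads NOTE only through its ADDTL_NOTE_DETAIL parent, as in the question's source:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Assumption: the wrapper element is called root; adjust to the real name -->
  <xsl:template match="/root">
    <table>
      <!-- Each LINE_INFO that has an ITEM_ID starts a new group; the
           LINE_INFO elements that follow it join the same group -->
      <xsl:for-each-group select="LINE_INFO"
          group-starting-with="LINE_INFO[ITEM_ID]">
        <tr>
          <td><xsl:value-of select="current-group()/ITEM_ID"/></td>
          <td>
            <!-- separator joins the selected values, however many there are -->
            <xsl:value-of select="current-group()/(DESC | EXT_DESC)"
                separator=" "/>
            <br/>
            <xsl:value-of select="current-group()/LINE_NOTE"/>
          </td>
          <td><xsl:value-of select="current-group()/QTY"/></td>
          <td><xsl:value-of select="current-group()/ADDTL_NOTE_DETAIL/NOTE"/></td>
        </tr>
      </xsl:for-each-group>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

One thing to watch for: if the data ever begins with a LINE_INFO that has no ITEM_ID, group-starting-with puts those leading elements in a group of their own, which would show up as a row with an empty first cell.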
(The rest of this answer is somewhat obsolete as the OP has XSLT 2.0.)
If 1.0, your best bet is Muenchian grouping. For the identifying-the-groups step (step 1), you would use a key like
<xsl:key name="LINE_INFO-by-section" match="LINE_INFO"
use="generate-id((. | preceding-sibling::LINE_INFO)[ITEM_ID][last()])" />
To iterate over the groups:
<xsl:for-each select="LINE_INFO[ITEM_ID]">
<xsl:copy>
To iterate over the members of the group:
<xsl:variable name="section-starter-id" select="generate-id(.)" />
<xsl:for-each select="key('LINE_INFO-by-section', $section-starter-id)">
<xsl:copy-of select="node()|@*" />
</xsl:for-each>
</xsl:copy>
</xsl:for-each>
(Untested.)
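For reference, the 1.0 fragments above could be assembled into a complete stylesheet roughly as follows. This is a sketch, also untested against the real data. It assumes the LINE_INFO elements sit directly under the document element (whatever its name), and it copies attributes only from the group's first LINE_INFO, because attributes cannot be added to an element once child nodes have been written:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Key each LINE_INFO by the id of the nearest LINE_INFO (itself or an
       earlier sibling) that has an ITEM_ID child -->
  <xsl:key name="LINE_INFO-by-section" match="LINE_INFO"
      use="generate-id((. | preceding-sibling::LINE_INFO)[ITEM_ID][last()])"/>

  <!-- Assumption: the document element is the parent of all LINE_INFO elements -->
  <xsl:template match="/*">
    <xsl:copy>
      <!-- One iteration per group starter -->
      <xsl:for-each select="LINE_INFO[ITEM_ID]">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <!-- All members of the group, the starter included, in document order -->
          <xsl:for-each select="key('LINE_INFO-by-section', generate-id(.))">
            <xsl:copy-of select="node()"/>
          </xsl:for-each>
        </xsl:copy>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Any LINE_INFO that appears before the first ITEM_ID is keyed under an empty string and is therefore never output by this sketch.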