XSLT - Using substring with copy-of to preserve inner HTML tags
I have some XML like this:
<story><p><strong>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</strong>Nulla vel mauris metus. Etiam vel tortor vel magna bibendum euismod nec varius turpis. Nullam ullamcorper, nunc vel auctor consectetur, quam felis accumsan eros, lacinia fringilla mauris est vel lectus. Curabitur et tortor eros. Duis sed convallis metus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Cras tempus quam sed enim gravida bibendum. Vestibulum magna ligula, varius in sodales eu, ultricies volutpat sem. Phasellus ante justo, vestibulum eu hendrerit a, posuere vitae est. Integer at pulvinar est.</p><p>Quisque a commodo eros. Integer tempus mi sit amet leo consectetur adipiscing. Nullam sit amet enim metus. Curabitur sollicitudin egestas arcu, at convallis enim iaculis eget. Etiam faucibus, justo sit amet lacinia consectetur, purus nunc rhoncus dui, id malesuada tortor est sed orci. Quisque eget nisi vitae mi facilisis varius. Integer fringilla eros sit amet velit vehicula commodo. </p><br /><span>And some more text here</span>
</story>
I want to do this:
<xsl:copy-of select="substring(story/node(),1,500)"/>
Here is the problem: I lose the <p>, <strong>, <br /> and other HTML tags inside the <story> tag whenever I take the substring. Is there any way to get the first 500 characters of the story tag while keeping the inner HTML tags?
Thanks!
Here is another approach in XSLT 1.0, without having to use the node-set extension:
<!-- Identity-style template: copies the node, then hands its children to copy-nodes -->
<xsl:template match="@*|node()" mode="limit-length">
  <xsl:param name="length"/>
  <xsl:copy>
    <xsl:apply-templates select="@*" mode="limit-length"/>
    <xsl:call-template name="copy-nodes">
      <xsl:with-param name="nodes" select="node()"/>
      <xsl:with-param name="length" select="$length"/>
    </xsl:call-template>
  </xsl:copy>
</xsl:template>

<!-- Text nodes are truncated to the remaining length. The explicit priority
     avoids a conflict with node() above, which has the same default priority. -->
<xsl:template match="text()" mode="limit-length" priority="1">
  <xsl:param name="length"/>
  <xsl:value-of select="substring(., 1, $length)"/>
</xsl:template>

<!-- Recursively copies a list of sibling nodes until the length budget is used up -->
<xsl:template name="copy-nodes">
  <xsl:param name="nodes"/>
  <xsl:param name="length"/>
  <xsl:if test="$length > 0 and $nodes">
    <xsl:variable name="head" select="$nodes[1]"/>
    <xsl:apply-templates select="$head" mode="limit-length">
      <xsl:with-param name="length" select="$length"/>
    </xsl:apply-templates>
    <!-- Budget left after the first node's full string value -->
    <xsl:variable name="remaining" select="$length - string-length($head)"/>
    <xsl:if test="$remaining > 0 and count($nodes) > 1">
      <xsl:call-template name="copy-nodes">
        <xsl:with-param name="nodes" select="$nodes[position() > 1]"/>
        <xsl:with-param name="length" select="$remaining"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:if>
</xsl:template>
Basically this is the identity template, with copying of the child nodes offloaded to a recursive template which takes care of keeping to the maximum string length, plus a separate template for text nodes, truncating them to the maximum length.
You can invoke this for the sample input as follows:
<xsl:call-template name="copy-nodes">
  <xsl:with-param name="nodes" select="story/node()"/>
  <xsl:with-param name="length" select="500"/>
</xsl:call-template>
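To try this end to end, the templates can be dropped into a complete stylesheet. The following is a minimal sketch, not part of the original answer: the root template, the <div> wrapper, the xsl:output settings, the limit of 10 and the priority="1" on the text() template are all my assumptions, chosen so the result is easy to check by hand.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: plain XML output, no declaration, no added indentation -->
  <xsl:output method="xml" omit-xml-declaration="yes" indent="no"/>

  <!-- Hypothetical entry point: wrap the truncated story in a <div> -->
  <xsl:template match="/">
    <div>
      <xsl:call-template name="copy-nodes">
        <xsl:with-param name="nodes" select="story/node()"/>
        <!-- Small limit for illustration; the question uses 500 -->
        <xsl:with-param name="length" select="10"/>
      </xsl:call-template>
    </div>
  </xsl:template>

  <!-- Copy a node and pass its children on to copy-nodes -->
  <xsl:template match="@*|node()" mode="limit-length">
    <xsl:param name="length"/>
    <xsl:copy>
      <xsl:apply-templates select="@*" mode="limit-length"/>
      <xsl:call-template name="copy-nodes">
        <xsl:with-param name="nodes" select="node()"/>
        <xsl:with-param name="length" select="$length"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:template>

  <!-- Truncate text; explicit priority (my addition) so this wins over node() -->
  <xsl:template match="text()" mode="limit-length" priority="1">
    <xsl:param name="length"/>
    <xsl:value-of select="substring(., 1, $length)"/>
  </xsl:template>

  <!-- Walk the sibling list, subtracting each node's string length from the budget -->
  <xsl:template name="copy-nodes">
    <xsl:param name="nodes"/>
    <xsl:param name="length"/>
    <xsl:if test="$length > 0 and $nodes">
      <xsl:variable name="head" select="$nodes[1]"/>
      <xsl:apply-templates select="$head" mode="limit-length">
        <xsl:with-param name="length" select="$length"/>
      </xsl:apply-templates>
      <xsl:variable name="remaining" select="$length - string-length($head)"/>
      <xsl:if test="$remaining > 0 and count($nodes) > 1">
        <xsl:call-template name="copy-nodes">
          <xsl:with-param name="nodes" select="$nodes[position() > 1]"/>
          <xsl:with-param name="length" select="$remaining"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the small input <story><p><b>Hello</b> world</p><p>again</p></story>, this should produce <div><p><b>Hello</b> worl</p></div>: "Hello" uses 5 of the 10 characters, " world" is cut to its first 5, and the second <p> is skipped because the budget is exhausted.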
Follow-up: Splitting the story
For the follow-up question of splitting the story into two pieces after the first break or paragraph end after N characters, I'll make the simplifying assumption that you want to consider splitting only after <p> and <br> elements that appear as direct children of the <story> element (and not nested at an arbitrary depth). This makes the whole problem much easier.
Here is one way to accomplish it. To get the contents of the first part, you could use a template that processes a set of sibling nodes until the maximum string length is exceeded and a br or p is encountered, and then stops.
<xsl:template match="node()" mode="before-break">
  <xsl:param name="length"/>
  <!-- Keep copying while there is budget left, or until the next br/p is reached -->
  <xsl:if test="$length > 0 or not(self::br or self::p)">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::node()[1]"
                         mode="before-break">
      <xsl:with-param name="length" select="$length - string-length(.)"/>
    </xsl:apply-templates>
  </xsl:if>
</xsl:template>
And for the second part, you could create another template which searches for the same condition as the previous template, but outputs nothing until after that point:
<xsl:template match="node()" mode="after-break">
  <xsl:param name="length"/>
  <xsl:choose>
    <!-- Still in the first part: output nothing and move on to the next sibling -->
    <xsl:when test="$length > 0 or not(self::br or self::p)">
      <xsl:apply-templates select="following-sibling::node()[1]"
                           mode="after-break">
        <xsl:with-param name="length" select="$length - string-length(.)"/>
      </xsl:apply-templates>
    </xsl:when>
    <!-- Split point reached: copy this node and everything after it -->
    <xsl:otherwise>
      <xsl:if test="not(self::br)"> <!-- suppress the <br/> -->
        <xsl:copy-of select="."/>
      </xsl:if>
      <xsl:copy-of select="following-sibling::node()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
And here's how you can use those templates to split a story into two <div>s.
<xsl:template match="story">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <div>
      <xsl:apply-templates select="node()[1]" mode="before-break">
        <xsl:with-param name="length" select="500"/>
      </xsl:apply-templates>
    </div>
    <div>
      <xsl:apply-templates select="node()[1]" mode="after-break">
        <xsl:with-param name="length" select="500"/>
      </xsl:apply-templates>
    </div>
  </xsl:copy>
</xsl:template>
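Put together, the split could be run as a standalone stylesheet along these lines. This is only a sketch under a few assumptions of mine: the global parameter name max-length, its default of 10 (picked so the result can be verified by hand) and the xsl:output settings are not from the answer above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="no"/>

  <!-- Hypothetical global parameter; can be overridden by the caller -->
  <xsl:param name="max-length" select="10"/>

  <!-- First part: copy siblings until the budget is spent and a br/p comes up -->
  <xsl:template match="node()" mode="before-break">
    <xsl:param name="length"/>
    <xsl:if test="$length > 0 or not(self::br or self::p)">
      <xsl:copy-of select="."/>
      <xsl:apply-templates select="following-sibling::node()[1]" mode="before-break">
        <xsl:with-param name="length" select="$length - string-length(.)"/>
      </xsl:apply-templates>
    </xsl:if>
  </xsl:template>

  <!-- Second part: skip siblings up to the same split point, then copy the rest -->
  <xsl:template match="node()" mode="after-break">
    <xsl:param name="length"/>
    <xsl:choose>
      <xsl:when test="$length > 0 or not(self::br or self::p)">
        <xsl:apply-templates select="following-sibling::node()[1]" mode="after-break">
          <xsl:with-param name="length" select="$length - string-length(.)"/>
        </xsl:apply-templates>
      </xsl:when>
      <xsl:otherwise>
        <!-- A br at the split point is dropped; a p is kept -->
        <xsl:if test="not(self::br)">
          <xsl:copy-of select="."/>
        </xsl:if>
        <xsl:copy-of select="following-sibling::node()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Split the story into two <div>s using the same budget for both passes -->
  <xsl:template match="story">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <div>
        <xsl:apply-templates select="node()[1]" mode="before-break">
          <xsl:with-param name="length" select="$max-length"/>
        </xsl:apply-templates>
      </div>
      <div>
        <xsl:apply-templates select="node()[1]" mode="after-break">
          <xsl:with-param name="length" select="$max-length"/>
        </xsl:apply-templates>
      </div>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

For the input <story><p>First paragraph.</p><p>Second.</p><br/><span>Tail</span></story>, the first <p> (16 characters) uses up the budget of 10, so the split falls at the second <p>, giving <story><div><p>First paragraph.</p></div><div><p>Second.</p><br/><span>Tail</span></div></story>. With xsltproc the limit could be changed from the command line, for example xsltproc --param max-length 500 split.xsl story.xml (the file names are placeholders).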
There is a very similar question at Get N characters introduction text with XSLT 1.0 from XHTML
Here is the XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
  <xsl:param name="MAXCHARS">500</xsl:param>

  <!-- Only matches a <body> root; for a <story> root the built-in rules
       lead straight to the node() template below -->
  <xsl:template match="/body">
    <xsl:apply-templates select="child::node()"/>
  </xsl:template>

  <xsl:template match="node()">
    <xsl:param name="LengthToParent">0</xsl:param>
    <!-- Get length of previous siblings -->
    <xsl:variable name="previousSizes">
      <xsl:for-each select="preceding-sibling::node()">
        <length>
          <xsl:value-of select="string-length(.)"/>
        </length>
      </xsl:for-each>
    </xsl:variable>
    <xsl:variable name="LengthToNode" select="sum(msxsl:node-set($previousSizes)/length)"/>
    <!-- Total amount of characters processed so far -->
    <xsl:variable name="LengthSoFar" select="$LengthToNode + number($LengthToParent)"/>
    <!-- Check limit is not exceeded -->
    <xsl:if test="$LengthSoFar &lt; number($MAXCHARS)">
      <xsl:choose>
        <xsl:when test="self::text()">
          <!-- Output text node with ... if required -->
          <xsl:value-of select="substring(., 1, number($MAXCHARS) - $LengthSoFar)"/>
          <xsl:if test="string-length(.) > number($MAXCHARS) - $LengthSoFar">...</xsl:if>
        </xsl:when>
        <xsl:otherwise>
          <!-- Output copy of node and recursively call template on its children -->
          <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="child::node()">
              <xsl:with-param name="LengthToParent" select="$LengthSoFar"/>
            </xsl:apply-templates>
          </xsl:copy>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
It works by looping through the child nodes of a node and totalling the length of the preceding siblings up to that point. Note that the code to get the length of the preceding siblings requires the node-set function, which is an extension function in XSLT 1.0. In my example I am using the Microsoft extension function, msxsl:node-set.
Where a node is not a text node, the total number of characters up to that point is the sum of the lengths of its preceding siblings, plus the sum of the lengths of the preceding siblings of the parent node (which is passed as a parameter to the template).
When the XSLT is applied to your input XML, the following is output:
<story>
<p>
<strong>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</strong>Nulla vel mauris metus. Etiam vel tortor vel magna bibendum euismod nec varius turpis. Nullam ullamcorper, nunc vel auctor consectetur, quam felis accumsan eros, lacinia fringilla mauris est vel lectus. Curabitur et tortor eros. Duis sed convallis metus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Cras tempus quam sed enim gravida bibendum. Vestibulum magna ligula, varius in sodales eu, ultr...
</p>
</story>
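The msxsl:node-set call ties the stylesheet to Microsoft processors. On processors that implement EXSLT (libxslt/xsltproc, for instance), the same preceding-sibling sum can be written with exsl:node-set instead. The stylesheet below is a small illustration of just that calculation, not a replacement for the answer's code: it prints, one per line, the character offset at which each child of the root element starts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="text"/>

  <!-- Illustrative only: for each child of the root element, sum the
       string lengths of its preceding siblings -->
  <xsl:template match="/*">
    <xsl:for-each select="node()">
      <!-- Result tree fragment holding one <length> per preceding sibling -->
      <xsl:variable name="previousSizes">
        <xsl:for-each select="preceding-sibling::node()">
          <length>
            <xsl:value-of select="string-length(.)"/>
          </length>
        </xsl:for-each>
      </xsl:variable>
      <!-- exsl:node-set turns the fragment into a node-set that sum() can use -->
      <xsl:value-of select="sum(exsl:node-set($previousSizes)/length)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For <story><p>abc</p><p>de</p><br/></story> this prints 0, 3 and 5. In the stylesheet above, the equivalent change would be to declare xmlns:exsl="http://exslt.org/common" and call exsl:node-set($previousSizes) where msxsl:node-set($previousSizes) is used now.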