
How to detect if a node contains significant information?

I need to find out how to detect if a node contains significant information.

The following example shows what I do not consider "significant" information:

<node>
    <node1>&nbsp;</node1>
    </br></br>
    &nbsp;

    <node1>
        &nbsp;
        <node2></br>&nbsp;</node2>
        </br></br>
    </node1>
    <!--
    and so on...
    -->
</node>

This <node> is "empty" for me.


Here is how to do it:

<!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp "&#160;"> ]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

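 <!-- Drop text nodes that contain only whitespace and/or non-breaking spaces.
      The space in the translate() list covers the single spaces that
      normalize-space() leaves between non-breaking spaces. -->
 <xsl:template match=
   "text()
      [translate(normalize-space(), ' &#160;','')
      = ''
      ]"/>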
</xsl:stylesheet>

When this transformation is applied to the following XML document (the one you provided was severely malformed: it is not well-formed in several ways):

<!DOCTYPE node [ <!ENTITY nbsp "&#160;"> ]>
<node>
    <node1>&nbsp;</node1>
    <br></br>
    &nbsp;

    <node1>
        &nbsp;
        <node2><br/>&nbsp;</node2>
        <br></br>
    </node1>
    <!--
    and so on...
    -->
</node>

then the wanted result is produced:

<node>
   <node1/>
   <br/>
   <node1>
      <node2>
         <br/>
      </node2>
      <br/>
   </node1><!--
    and so on...
    -->
</node>

This technique can be generalized:

You can put all whitespace characters in an xsl:variable and then simply override the identity rule with this template:

<!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp "&#160;"> ]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:variable name="vwhiteSpace" select="' &#x9;&#xA;&#xD;&nbsp;'"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="text()">
   <xsl:if test="translate(., $vwhiteSpace,'') != ''">
     <xsl:copy-of select="."/>
   </xsl:if>
 </xsl:template>
</xsl:stylesheet>

You can add any further characters you consider "white-space" to $vwhiteSpace.
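
As a hedged illustration of extending the list: the declaration below adds the zero-width space (U+200B) and the ideographic space (U+3000) to the variable. Which characters count as "white-space" is an assumption that depends on your data; these two are only examples and were not part of the original question.

```xml
<!-- Drop-in replacement for the $vwhiteSpace declaration above.
     &#x200B; (zero-width space) and &#x3000; (ideographic space) are
     illustrative additions only; list whatever characters your data needs.
     &nbsp; still relies on the entity declared in the stylesheet's DOCTYPE. -->
<xsl:variable name="vwhiteSpace"
     select="' &#x9;&#xA;&#xD;&nbsp;&#x200B;&#x3000;'"/>
```

Because translate() removes every character listed in its second argument when the third argument is empty, no other part of the stylesheet has to change.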

Update: The OP indicated in a comment that they actually want to determine whether a "node" is significant or not, rather than to "clean a node".

The solution to this is already contained in my solution to the initial problem:

  <xsl:variable name="vIsSignificant" select=
     "translate(., $vwhiteSpace,'') != ''"/>

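For context, here is a minimal sketch of how that variable could be used in a complete stylesheet. It assumes you want a plain-text report for the top element of the document; the report wording ("significant" / "empty") and the choice of match="/*" are assumptions made for illustration, not something the question specified.

```xml
<!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp "&#160;"> ]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Same character list as in the cleaning stylesheet -->
    <xsl:variable name="vwhiteSpace" select="' &#x9;&#xA;&#xD;&nbsp;'"/>

    <!-- Report whether the top element holds any significant text -->
    <xsl:template match="/*">
        <!-- "." is the string value of the element: the concatenation
             of all of its descendant text nodes -->
        <xsl:variable name="vIsSignificant" select=
         "translate(., $vwhiteSpace,'') != ''"/>

        <xsl:value-of select="name()"/>
        <xsl:text>: </xsl:text>
        <xsl:choose>
            <xsl:when test="$vIsSignificant">significant</xsl:when>
            <xsl:otherwise>empty</xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Applied to the well-formed sample document above, this should output "node: empty". Note that the string value of an element covers only its descendant text nodes, so the text inside the comment and any attribute values do not make a node "significant" under this test.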