How do I pretty print an XSLT result document with removed source elements?

I have a source XHTML document with elements in multiple namespaces that I am transforming into an HTML document (obviously with no namespaces). In my XSL templates I only match elements in the XHTML namespace to remove non-HTML-compatible elements from the result tree. However, in the output, while those elements are gone, the whitespace I used to indent them remains—i.e., lines of irrelevant CR/LFs and tabs.

For example, if this is my input:

<div id="container">
    <svg:svg>
        <svg:foreignObject>
            <img />
        </svg:foreignObject>
    </svg:svg>
</div>

After applying the transformation, this will be the output:

<div id="container">


            <img />


</div>

While my desired output is this:

<div id="container">
    <img />
</div>

This happens using both TransforMiiX (attaching the stylesheet locally in Firefox) and libxslt (attaching the stylesheet server-side with PHP), so I know it's probably the result of some XSL parameter not getting set, but I've tried playing with <xsl:output indent="yes|no" />, xml:space="default|preserve", <xsl:strip-space elements="foo bar|*" />, all to no avail.

This will be implemented server-side so if there's no way to do it in raw XSL but there is a way to do it in PHP I'll accept that.

I know this is not a namespace issue since I get the same result if I remove ANY element.


The whitespace you see comes from the source document. XSLT's built-in template rules copy text nodes to the output, whether or not they contain only whitespace. To override the built-in rule for text nodes, include:

<xsl:template match="text()" />
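Note that an empty template for text() suppresses every text node, real content included. If only the indentation should go, a narrower variant may be closer to what you want. This is a sketch rather than part of the original suggestion; because the pattern carries a predicate, its default priority is 0.5, so it takes precedence over a plain text() or node() template:

```xml
<!-- Sketch: drop only whitespace-only text nodes. Text with real content
     does not match here and is handled by the built-in rule (or by any
     other text() template), so it is still copied. -->
<xsl:template match="text()[not(normalize-space())]" />
```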

Alternatively: Spot any <xsl:apply-templates /> (or <xsl:apply-templates select="node()" />) and explicitly specify which children you want to apply templates to. This method might be necessary if your transformation partly relies on the identity template (in which case the empty template for text nodes would be counter-productive).
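A minimal sketch of the explicit-selection approach follows. It assumes the HTML elements in the source are in the XHTML namespace (bound here to a hypothetical xhtml prefix) and that everything else should be dropped while its XHTML descendants are kept. The template names and structure are illustrative, not taken from your stylesheet:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

    <xsl:output method="html" indent="yes" />

    <!-- Elements outside the XHTML namespace: emit nothing for the element
         itself and descend into element children only, so the whitespace
         that indented them is never visited. -->
    <xsl:template match="*">
        <xsl:apply-templates select="*" />
    </xsl:template>

    <!-- XHTML elements: re-create them without a namespace, keep their
         attributes, and process all children (including text). -->
    <xsl:template match="xhtml:*">
        <xsl:element name="{local-name()}">
            <xsl:copy-of select="@*" />
            <xsl:apply-templates />
        </xsl:element>
    </xsl:template>

</xsl:stylesheet>
```

The xhtml:* pattern has a higher default priority than *, so XHTML elements never fall into the first template. One caveat: any real text placed directly inside a removed element is dropped as well, since only element children are selected there.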

I have marked up the "insignificant" white space in your snippet the way Word would do it:

<div id="container">¶
····<svg:svg>¶
········<svg:foreignObject>¶
············<img />¶
········</svg:foreignObject>¶
····</svg:svg>¶
</div>

EDIT: You can also modify your identity template like this:

<xsl:template match="node() | @*">
  <xsl:copy>
    <!-- select everything except blank text nodes -->
    <xsl:apply-templates select="
      node()[not(self::text())] | text()[normalize-space() != ''] | @*
    " />
  </xsl:copy>
</xsl:template>

This removes every whitespace-only text node (attribute values remain untouched, since they are not text nodes). Use <xsl:output indent="yes" /> to pretty-print the result.


You have two ways to achieve your desired result: either fix your original transformation to handle whitespace differently, or keep the transformation as-is and add a second pass that prettifies the output. If your original transformation is complicated, I'd recommend the 2-pass approach. Making the transformation even more complicated invites corner cases that need special handling, and each special case risks breaking something that used to work.

You should be able to ignore the whitespace-only text nodes by testing them with normalize-space(). Here's what the second pass could look like. If you go with the 1-pass approach, the code will be roughly the same.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes" />

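    <!-- text() and node() both default to priority -0.5, so raise this
         template explicitly; otherwise the later node() template may win. -->
    <xsl:template match="text()" priority="1">
        <xsl:if test="normalize-space(.) != ''">
            <xsl:value-of select="."/>
        </xsl:if>
    </xsl:template>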

    <xsl:template match="node()">
        <xsl:copy>
            <xsl:copy-of select="@*" />
            <xsl:apply-templates />
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>