
Use XSLT 1.0 to group XML elements into buckets, in order, based on some criteria

Say I had some XML that I wanted to convert to HTML. The XML is divided into ordered sections:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <section attr="someCriteria">
    <h1>Title 1</h1>
    <p>paragraph 1-1</p>
    <p>paragraph 1-2</p>
  </section>
  <section attr="someOtherCriteria">
    <h3>Subtitle 2</h3>
    <ul>
      <li>list item 2-1</li>
      <li>list item 2-2</li>
      <li>list item 2-3</li>
      <li>list item 2-4</li>
    </ul>
  </section>
  <section attr="anotherSetOfCriteria">
    <warning>
      Warning: This product could kill you
    </warning>
  </section>
  <section attr="evenMoreCriteria">
    <disclaimer>
      You were warned
    </disclaimer>
  </section>
  <section attr="criteriaSupreme">
    <p>Copyright 1999-2011</p>
  </section>
</root>

I have several of these XML documents. I need to group and transform these sections based on criteria. There will be two different kinds of buckets.

  • The first section goes in a bucket (e.g. <div class="FormatOne"></div>).
  • If the second section meets the criteria for the "FormatOne" bucket, it also goes in this bucket.
  • If the third section requires a different bucket (e.g. <div class="FormatTwo"></div>), a new bucket is created and the section's contents are placed in it.
  • If the fourth section requires "FormatOne" (which differs from the previous format), a new bucket is created again and the section's contents are placed in it.
  • etc. Each section would go into the same bucket as the previous section if they are the same format. If not, a new bucket is created.

So for each document, depending on the logic for separating buckets, the document may end up like this:

<body>
  <div class="FormatOne">
    <h1>Title 1</h1>
    <p>paragraph 1-1</p>
    <p>paragraph 1-2</p>
    <h3>Subtitle 2</h3>
    <ul>
      <li>list item 2-1</li>
      <li>list item 2-2</li>
      <li>list item 2-3</li>
      <li>list item 2-4</li>
    </ul>
  </div>
  <div class="FormatTwo">
    <span class="warningText">
      Warning: This product could kill you
    </span>
  </div>
  <div class="FormatOne">
    <span class="disclaimerText"> You were warned</span>
    <p class="copyright">Copyright 1999-2011</p>
  </div>
</body>

or this:

<body>
  <div class="FormatOne">
    <h1>Title 1</h1>
    <p>paragraph 1-1</p>
    <p>paragraph 1-2</p>
    <h3>Subtitle 2</h3>
  </div>
  <div class="FormatTwo">
    <ul>
      <li>list item 2-1</li>
      <li>list item 2-2</li>
      <li>list item 2-3</li>
      <li>list item 2-4</li>
    </ul>
  </div>
  <div class="FormatOne">
    <span class="warningText">
      Warning: This product could kill you
    </span>
    <span class="disclaimerText"> You were warned</span>
    <p class="copyright">Copyright 1999-2011</p>
  </div>
</body>

or even this:

<body>
  <div class="FormatOne">
    <h1>Title 1</h1>
    <p>paragraph 1-1</p>
    <p>paragraph 1-2</p>
    <h3>Subtitle 2</h3>
    <ul>
      <li>list item 2-1</li>
      <li>list item 2-2</li>
      <li>list item 2-3</li>
      <li>list item 2-4</li>
    </ul>
    <span class="warningText">
      Warning: This product could kill you
    </span>
    <span class="disclaimerText"> You were warned</span>
    <p class="copyright">Copyright 1999-2011</p>
  </div>
</body>

depending on how the sections are defined.

Is there a way to use an XSLT to perform this type of grouping magic?

Any help would be great. Thanks!


I came up with a solution that involves hitting each section sequentially. The processing of each section is broken into two parts: a "shell" and a "contents" portion. The "shell" is responsible for rendering the <div class="FormatOne">...</div> bits, and the "contents" is responsible for rendering the actual contents of the current section and all following sections until a non-matching section is found.

When a non-matching section is found, control reverts to the "shell" template for that section.

This gives an interesting bit of flexibility: the "shell" templates may be very aggressive in what they match, and the "contents" sections may be more discerning. Specifically, with your first example output, you need the warning element to appear as <span class="warningText">...</span>, and this is accomplished with a more closely matching template.

All "content" templates, after rendering the contents of their current section, call a named template that looks for the "next" appropriate content section. This helps consolidate the rules for determining what qualifies as a "matching" section.

You can see a working example here.

Here is my code, built to replicate what you asked for in your first example:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" />

    <xsl:template match="/">
        <body>
            <xsl:apply-templates select="/root/section[1]" mode="shell" />
        </body>
    </xsl:template>

    <xsl:template match="section[
        @attr = 'someCriteria' or
        @attr = 'someOtherCriteria' or
        @attr = 'evenMoreCriteria' or
        @attr = 'criteriaSupreme']" mode="shell">

        <div class="FormatOne">
            <xsl:apply-templates select="." mode="contents" />
        </div>

        <xsl:apply-templates select="following-sibling::section[
            @attr != 'someCriteria' and
            @attr != 'someOtherCriteria' and
            @attr != 'evenMoreCriteria' and
            @attr != 'criteriaSupreme'][1]" mode="shell" />

    </xsl:template>

    <xsl:template name="nextFormatOne">
        <xsl:variable name="next" select="following-sibling::section[1]" />
        <xsl:if test="$next[
            @attr = 'someCriteria' or
            @attr = 'someOtherCriteria' or
            @attr = 'evenMoreCriteria' or
            @attr = 'criteriaSupreme']">
            <xsl:apply-templates select="$next" mode="contents" />
        </xsl:if>
    </xsl:template>

    <xsl:template match="section[
        @attr = 'someCriteria' or
        @attr = 'someOtherCriteria']" mode="contents">

        <xsl:copy-of select="*" />

        <xsl:call-template name="nextFormatOne" />
    </xsl:template>

    <xsl:template match="section[@attr = 'evenMoreCriteria']" mode="contents">
        <span class="disclaimerText">
            <xsl:value-of select="disclaimer" />
        </span>

        <xsl:call-template name="nextFormatOne" />
    </xsl:template>

    <xsl:template match="section[@attr = 'criteriaSupreme']" mode="contents">
        <p class="copyright">
            <xsl:value-of select="p" />
        </p>

        <xsl:call-template name="nextFormatOne" />
    </xsl:template>

    <xsl:template match="section[@attr = 'anotherSetOfCriteria']" mode="shell">
        <div class="FormatTwo">
            <xsl:apply-templates select="." mode="contents" />
        </div>
        <xsl:apply-templates select="
            following-sibling::section[@attr != 'anotherSetOfCriteria'][1]"
            mode="shell" />
    </xsl:template>

    <xsl:template name="nextFormatTwo">
        <xsl:variable name="next" select="following-sibling::section[1]" />
        <xsl:if test="$next[@attr = 'anotherSetOfCriteria']">
            <xsl:apply-templates select="$next" mode="contents" />
        </xsl:if>
    </xsl:template>

    <xsl:template
        match="section[@attr = 'anotherSetOfCriteria']"
        mode="contents">

        <span class="warningText">
            <xsl:value-of select="warning" />
        </span>

        <xsl:call-template name="nextFormatTwo" />
    </xsl:template>

</xsl:stylesheet>


"Each section would go into the same bucket as the previous section if they are the same format. If not, a new bucket is created."

What you have described is essentially the task performed by the instruction

<xsl:for-each-group group-adjacent="....">

in XSLT 2.0. This assumes that you can write a function which translates your "criteria" into a bucket-name, and call this function within the group-adjacent attribute.
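
As a rough sketch of what that could look like for the sample document: the function name `my:bucket`, its namespace URI, and the rule that only `anotherSetOfCriteria` maps to "FormatTwo" are all assumptions made for illustration, so substitute your real criteria. It requires an XSLT 2.0 processor.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:buckets"
    exclude-result-prefixes="xs my">
    <xsl:output method="xml" indent="yes" />

    <!-- Hypothetical helper: translates a section's criteria into a
         bucket name. The mapping here is an assumption. -->
    <xsl:function name="my:bucket" as="xs:string">
        <xsl:param name="section" as="element(section)" />
        <xsl:sequence select="
            if ($section/@attr = 'anotherSetOfCriteria')
            then 'FormatTwo'
            else 'FormatOne'" />
    </xsl:function>

    <xsl:template match="/root">
        <body>
            <!-- Consecutive sections with the same bucket name form one group -->
            <xsl:for-each-group select="section"
                group-adjacent="my:bucket(.)">
                <div class="{current-grouping-key()}">
                    <xsl:apply-templates select="current-group()/*" />
                </div>
            </xsl:for-each-group>
        </body>
    </xsl:template>

    <!-- Special-case rendering for particular elements -->
    <xsl:template match="warning">
        <span class="warningText">
            <xsl:value-of select="normalize-space(.)" />
        </span>
    </xsl:template>

    <xsl:template match="disclaimer">
        <span class="disclaimerText">
            <xsl:value-of select="normalize-space(.)" />
        </span>
    </xsl:template>

    <xsl:template match="section[@attr = 'criteriaSupreme']/p">
        <p class="copyright">
            <xsl:value-of select="." />
        </p>
    </xsl:template>

    <!-- Everything else is copied through unchanged -->
    <xsl:template match="*">
        <xsl:copy-of select="." />
    </xsl:template>
</xsl:stylesheet>
```

With this assumed mapping, sections 1-2, 3, and 4-5 of the sample input should fall into three adjacent groups, which corresponds to the first desired output in the question. Changing only `my:bucket` changes where the bucket boundaries fall.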

So, how hard a constraint is it that you need to use XSLT 1.0?

If you are stuck with 1.0, then you'll have to go with the sibling-recursion design pattern as proposed by Chris Nielsen.
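
For XSLT 1.0, here is one possible variation on that sibling-recursion pattern that keeps the criteria-to-bucket mapping in a single template, so the attribute lists are not repeated. The mode names `bucket`, `shell`, `contents`, and `next-shell`, and the mapping itself, are assumptions chosen for this sketch rather than anything taken from the answer above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" />

    <xsl:template match="/root">
        <body>
            <xsl:apply-templates select="section[1]" mode="shell" />
        </body>
    </xsl:template>

    <!-- Single place that maps a section to a bucket name (assumed rule) -->
    <xsl:template match="section" mode="bucket">
        <xsl:choose>
            <xsl:when test="@attr = 'anotherSetOfCriteria'">FormatTwo</xsl:when>
            <xsl:otherwise>FormatOne</xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Opens a bucket, fills it, then looks for the start of the next one -->
    <xsl:template match="section" mode="shell">
        <xsl:variable name="bucket">
            <xsl:apply-templates select="." mode="bucket" />
        </xsl:variable>
        <div class="{$bucket}">
            <xsl:apply-templates select="." mode="contents">
                <xsl:with-param name="bucket" select="string($bucket)" />
            </xsl:apply-templates>
        </div>
        <xsl:apply-templates select="following-sibling::section[1]"
            mode="next-shell">
            <xsl:with-param name="bucket" select="string($bucket)" />
        </xsl:apply-templates>
    </xsl:template>

    <!-- Renders this section, then recurses while the bucket is unchanged -->
    <xsl:template match="section" mode="contents">
        <xsl:param name="bucket" />
        <xsl:variable name="mine">
            <xsl:apply-templates select="." mode="bucket" />
        </xsl:variable>
        <xsl:if test="$mine = $bucket">
            <xsl:apply-templates select="*" />
            <xsl:apply-templates select="following-sibling::section[1]"
                mode="contents">
                <xsl:with-param name="bucket" select="$bucket" />
            </xsl:apply-templates>
        </xsl:if>
    </xsl:template>

    <!-- Skips sections already emitted; opens a shell at the first mismatch -->
    <xsl:template match="section" mode="next-shell">
        <xsl:param name="bucket" />
        <xsl:variable name="mine">
            <xsl:apply-templates select="." mode="bucket" />
        </xsl:variable>
        <xsl:choose>
            <xsl:when test="$mine = $bucket">
                <xsl:apply-templates select="following-sibling::section[1]"
                    mode="next-shell">
                    <xsl:with-param name="bucket" select="$bucket" />
                </xsl:apply-templates>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates select="." mode="shell" />
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Element-level rendering -->
    <xsl:template match="warning">
        <span class="warningText"><xsl:value-of select="normalize-space(.)" /></span>
    </xsl:template>
    <xsl:template match="disclaimer">
        <span class="disclaimerText"><xsl:value-of select="normalize-space(.)" /></span>
    </xsl:template>
    <xsl:template match="section[@attr = 'criteriaSupreme']/p">
        <p class="copyright"><xsl:value-of select="." /></p>
    </xsl:template>
    <xsl:template match="*">
        <xsl:copy-of select="." />
    </xsl:template>
</xsl:stylesheet>
```

The trade-off is that each section's bucket is computed more than once, and the recursion depth grows with the number of sections, which may matter for very long documents.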


Check out template, if, choose and for-each:

  • http://www.w3schools.com/xsl/xsl_templates.asp
  • http://www.w3schools.com/xsl/xsl_if.asp
  • http://www.w3schools.com/xsl/xsl_choose.asp
  • http://www.w3schools.com/xsl/xsl_for_each.asp

These (as well as other XSLT elements) let you add conditional behavior to switch between different transformation logic.
