开发者

Processing cyclic dependencies with XSLT

I’m processing an 开发者_开发问答XML file that, simplified, looks something like this:

<resources>
  <resource id="a">
    <dependency idref="b"/>
    <!-- some other stuff -->
  </resource>
  <resource id="b">
    <!-- some other stuff -->
  </resource>
</resources>

The XSLT stylesheet must process a particular resource that we’re interested in, which I will call the root resource, and all recursive dependencies. Dependencies are other resources, uniquely identified by their id attribute.

It doesn’t matter if a resource is processed twice, although it’s preferable to process each required resource only once. It also doesn’t matter what order the resources are processed in.

It’s important that only the root resource and its recursive dependencies are processed. We can’t just process all the resources and be done with it.

A naïve implementation is as follows:

<xsl:key name="resource-id" match="resource" use="@id"/>

<xsl:template match="resource">
  <!-- do whatever is required to process the resource. -->

  <!-- then handle any dependencies -->
  <xsl:apply-templates select="key('resource-id', dependency/@idref)"/>
</xsl:template>

This implementation works fine for the example above, as well as in many real-world cases. It does have the disadvantage that it often processes the same resource more than once, but as stated above that’s not hugely important.

The problem is that sometimes resources have cyclic dependencies:

<resources>
  <resource id="a">
    <dependency idref="b"/>
    <dependency idref="d"/>
  </resource>
  <resource id="b">
    <dependency idref="c"/>
  </resource>
  <resource id="c">
    <dependency idref="a"/>
  </resource>
  <resource id="d"/>
</resources>

If you use the naïve implementation to process this example, and you start by processing a, b or c, you get infinite recursion.

Unfortunately I can’t control the input data and in any case cyclic dependencies are perfectly valid and allowed by the relevant specification.

I’ve come up with various partial solutions, but nothing that works in all cases.

The ideal solution would be a general approach to preventing a node from being processed more than once, but I don’t think that’s possible. In fact, I suspect this whole problem is impossible to solve.

If it helps, I have most of EXSLT available (including functions). If necessary I can also pre-process the input with any number of other XSLT scripts, although it’s preferable not to do excessive pre-processing of resources that won’t end up in the output.

What I can’t do is switch to processing this with another language (at least not without substantial re-engineering). I also can’t use XSLT 2.0.

Any ideas?


This is a simple solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:param name="pRootResourceId" select="'a'"/>

 <xsl:key name="kResById" match="resource" use="@id"/>

 <xsl:template match="/">
  <resourceProcessing root="{$pRootResourceId}">
    <xsl:apply-templates select=
    "key('kResById', $pRootResourceId)"/>
  </resourceProcessing>
 </xsl:template>

 <xsl:template match="resource">
  <xsl:param name="pVisited" select="'|'"/>

  <xsl:copy>
    <xsl:copy-of select="@*"/>

    <xsl:apply-templates select=
      "key('kResById',
           dependency/@idref
                [not(contains($pVisited, concat('|', ., '|')))])">
      <xsl:with-param name="pVisited"
       select="concat($pVisited, @id, '|')"/>
    </xsl:apply-templates>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When applied on the provided XML document:

<resources>
  <resource id="a">
    <dependency idref="b"/>
    <dependency idref="d"/>
  </resource>
  <resource id="b">
    <dependency idref="c"/>
  </resource>
  <resource id="c">
    <dependency idref="a"/>
  </resource>
  <resource id="d"/>
</resources>

the wanted, correct result is produced:

<resourceProcessing root="a">
   <resource id="a">
      <resource id="b">
         <resource id="c"/>
      </resource>
      <resource id="d"/>
   </resource>
</resourceProcessing>

The main idea is simple: Maintain a list of ids of visited resources and only allow the processing of a new resource if its id is not present in the list. The "processing" is for demonstration purposes and outputs the request wrapping all other requests (recursively), on which it depends.

Also note that every request is processed only once.

Years ago I provided a similar solution to a graph-traversal problem -- it can be found in the xml-dev group archives -- here. :)


Just for fun, another solution (following Dimitre) but increasing a node-set with visited nodes. I post two stylesheet, one with node set logic and other with node set comparison, because you must test wich is faster for big XML inputs.

So, this stylesheet:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:param name="pRootResourceId" select="'a'"/>
    <xsl:key name="kResById" match="resource" use="@id"/>
    <xsl:template match="/" name="resource">
        <xsl:param name="pVisited" select="key('kResById', $pRootResourceId)"/>
        <xsl:param name="pNew" select="key('kResById',$pVisited/dependency/@idref)"/>
        <xsl:choose>
            <xsl:when test="$pNew">
                <xsl:call-template name="resource">
                    <xsl:with-param name="pVisited" select="$pVisited|$pNew"/>
                    <xsl:with-param name="pNew" select="key('kResById',
           $pNew/dependency/@idref)[not(@id=($pVisited|$pNew)/@id)]"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <result>
                    <xsl:copy-of select="$pVisited"/>
                </result>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

And this stylesheet:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:param name="pRootResourceId" select="'a'"/>
    <xsl:key name="kResById" match="resource" use="@id"/>
    <xsl:template match="/" name="resource">
        <xsl:param name="pVisited" select="key('kResById', $pRootResourceId)"/>
        <xsl:param name="pNew" select="key('kResById', $pVisited/dependency/@idref)"/>
        <xsl:variable name="vAll" select="$pVisited|$pNew"/>
        <xsl:choose>
            <xsl:when test="$pNew">
                <xsl:call-template name="resource">
                    <xsl:with-param name="pVisited" select="$vAll"/>
                    <xsl:with-param name="pNew" select="key('kResById',
           $pNew/dependency/@idref)[count(.|$vAll)>count($vAll)]"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <result>
                    <xsl:copy-of select="$pVisited"/>
                </result>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

Both output:

(Wiht first input)

<result>
    <resource id="a">
        <dependency idref="b" />
        <!-- some other stuff -->
    </resource>
    <resource id="b">
        <!-- some other stuff -->
    </resource>
</result>

(With last input)

<result>
    <resource id="a">
        <dependency idref="b" />
        <dependency idref="d" />
    </resource>
    <resource id="b">
        <dependency idref="c" />
    </resource>
    <resource id="c">
        <dependency idref="a" />
    </resource>
    <resource id="d" />
</result>
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜