How to filter nodes with a certain child node

I have this XSLT stylesheet, in which I'm trying to find nodes that contain an a child element (and eventually ones whose a child carries the id=".." attribute). However, the <xsl:when test="a"> never matches no matter what I do. xsltproc just never matches there, and this command line hangs indefinitely while just issuing getdateandtime all the time.

saxon9 lib/docbook/5/essays/foss-and-other-beasts-v3ll-in-one.xhtml bin/clean-up-docbook-xhtml-1.1.xslt

I'm on Mandriva Linux Cooker. Here is my stylesheet:

<xsl:stylesheet version = '1.0'
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    >

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"
        doctype-public="-//W3C//DTD XHTML 1.1//EN"
        doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
        />

    <xsl:template match="*">
        <xsl:apply-templates mode="foo" />
    </xsl:template>

    <xsl:template mode="copy_html_ns" match="*">
        <xsl:element xmlns="http://www.w3.org/1999/xhtml" 
                     name="{local-name()}">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo" />
       </xsl:element>
    </xsl:template>

    <xsl:template match="*" mode="foo">
        <xsl:choose>
            <xsl:when test="a">
                <xsl:element xmlns="http://www.w3.org/1999/xhtml" 
                             name="foobar">
                    <!--
                    <xsl:attribute name="id">
                        <xsl:value-of select="a[@id]" />
                    </xsl:attribute>
                    -->
                    <xsl:copy-of select="@*" />
                    <xsl:apply-templates mode="foo" />
                </xsl:element>
            </xsl:when>
            <xsl:when test="local-name() = 'a' and @id">
            </xsl:when>
            <xsl:otherwise>
                <xsl:element xmlns="http://www.w3.org/1999/xhtml" 
                             name="{local-name()}">
                    <xsl:copy-of select="@*" />
                    <xsl:apply-templates mode="foo" />
                </xsl:element>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>


Your test <xsl:when test="a"> is correct, for testing whether the context node has any child elements named a.

"However, the <xsl:when test="a"> never matches no matter what I did."

How do you know it never matches? If you provide sample input, expected output, and actual output, we can better diagnose the reason why expected output != actual output.
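One cause worth ruling out, assuming the input is XHTML whose elements are in the http://www.w3.org/1999/xhtml namespace (as DocBook-generated XHTML normally is): in XSLT 1.0 an unprefixed name in a test or match expression refers to an element in no namespace, and the default xmlns declared on the stylesheet does not change that. Under that assumption test="a" stays false, and the child has to be addressed through a prefix bound to the XHTML namespace, such as the xhtml prefix the stylesheet already declares. A minimal sketch, not a drop-in replacement for the full stylesheet:

```xml
<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Start at the document node so the document element is processed in mode "foo" too. -->
    <xsl:template match="/">
        <xsl:apply-templates select="*" mode="foo"/>
    </xsl:template>

    <xsl:template match="*" mode="foo">
        <xsl:choose>
            <!-- xhtml:a selects child a elements in the XHTML namespace;
                 a plain "a" would only select a elements in no namespace. -->
            <xsl:when test="xhtml:a">
                <foobar>
                    <xsl:copy-of select="@*"/>
                    <xsl:apply-templates mode="foo"/>
                </foobar>
            </xsl:when>
            <xsl:otherwise>
                <xsl:copy>
                    <xsl:copy-of select="@*"/>
                    <xsl:apply-templates mode="foo"/>
                </xsl:copy>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

If the input turns out to have no namespace at all, the unprefixed test="a" is the right one and the cause lies elsewhere.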

BTW do you know that

<xsl:template match="*">
    <xsl:apply-templates mode="foo" />
</xsl:template>

will, for each element it matches, apply the mode "foo" templates to that element's children without producing any output for the element itself? In your stylesheet the only element processed outside mode "foo" is the document element, so it is dropped from the output. For a document like

<w><x><y><z/></y></x></w>

the elements x, y and z are each processed once in mode "foo", but w never is. Maybe you meant that first template to be

<xsl:template match="/">
    <xsl:apply-templates mode="foo" />
</xsl:template>

I also like to put an explicit select="*" on apply-templates, just to make it easier to see what's going on. But that's a matter of preference.
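For illustration, here is roughly what that looks like; a sketch only, with the mode "foo" template reduced to a plain copy. One caveat: select="*" picks up element children only, so inside templates that handle mixed content it would skip text nodes, and select="node()" is the safer choice there.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- The only element child of the document node is the document element. -->
    <xsl:template match="/">
        <xsl:apply-templates select="*" mode="foo"/>
    </xsl:template>

    <xsl:template match="*" mode="foo">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <!-- node() rather than * here, so that text children are still processed. -->
            <xsl:apply-templates select="node()" mode="foo"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```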


Edit
[removed misleading, incorrect code parts]

Since there is no input document and no desired output, I'm working from your XSL above and from this assumption:

You need to find, and handle differently, all the nodes that

  • contain a tags ==> foobar,
  • contain a tags with an id attribute ==> has-a-with-id,
  • are any other tags (*) ==> they should just be copied.

So if you have an input xml like

<?xml version="1.0"?>
<base>
    <some-child>
        <a>an a</a>
        <b>a b</b>
        <a>other a</a>
        <b>other b</b>
    </some-child>
    <some-child>
        <b>third b</b>
        <a id="blah">third a</a>
        <b>fourth b</b>
    </some-child>
    <some-child>
        <b>last b</b>
    </some-child>
</base>

your output should be

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE base PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<base xmlns="http://www.w3.org/1999/xhtml">
    <foobar>
        <a>an a</a>
        <b>a b</b>
        <a>other a</a>
        <b>other b</b>
    </foobar>
    <has-a-with-id>
        <b>third b</b>
        <a id="blah">third a</a>
        <b>fourth b</b>
    </has-a-with-id>
    <some-child>
        <b>last b</b>
    </some-child>
</base>

If this is the case, my solution would be

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns="http://www.w3.org/1999/xhtml" 
        xmlns:xhtml="http://www.w3.org/1999/xhtml" 
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" 
        doctype-public="-//W3C//DTD XHTML 1.1//EN" 
        doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <xsl:apply-templates mode="foo"/>
    </xsl:template>

    <xsl:template match="*[a[@id]]" mode="foo">
        <xsl:element xmlns="http://www.w3.org/1999/xhtml" name="has-a-with-id">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo"/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="*[a[not(@id)]]" mode="foo">
        <xsl:element xmlns="http://www.w3.org/1999/xhtml" name="foobar">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo"/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="*" mode="foo">
        <xsl:element xmlns="http://www.w3.org/1999/xhtml" name="{local-name()}">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

I'd also be curious how to optimize this code, so if anyone has an idea, please feel free to share / edit.
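One point that may be worth tightening, offered as a sketch rather than a definitive fix: the two predicate patterns above have the same default priority (0.5), so an element that contains both an a with an id and an a without one matches two templates at once. XSLT 1.0 treats that as a conflict, which a processor may report as an error or resolve by picking the rule that comes last in the stylesheet. Explicit priority values make the outcome deterministic (the numbers 2 and 1 below are arbitrary choices), and the second pattern can then be simplified to *[a]. Like the stylesheet above, this assumes input without a namespace, and the doctype settings are left out for brevity.

```xml
<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <xsl:apply-templates mode="foo"/>
    </xsl:template>

    <!-- Wins whenever at least one child a carries an id. -->
    <xsl:template match="*[a/@id]" mode="foo" priority="2">
        <has-a-with-id>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo"/>
        </has-a-with-id>
    </xsl:template>

    <!-- Reached only by elements whose a children all lack an id. -->
    <xsl:template match="*[a]" mode="foo" priority="1">
        <foobar>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo"/>
        </foobar>
    </xsl:template>

    <!-- Everything else: re-create the element in the XHTML namespace. -->
    <xsl:template match="*" mode="foo">
        <xsl:element name="{local-name()}">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="foo"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
```

The literal result elements pick up the XHTML namespace from the stylesheet's default xmlns declaration, so the per-element xmlns attributes are not needed.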


I'm trying to find nodes that contain an "a" child element (and eventually one that contains the id=".." attribute).

You have a mess of modes -- this is totally irrelevant to your question.

This match pattern

*[a and @id]

matches any child of the current node (when the corresponding <xsl:apply-templates> is executed) that itself has a child a and also has an id attribute.

This match pattern:

*[a and @id='someString']

matches any child of the current node (when the corresponding <xsl:apply-templates> is executed) that itself has a child a and also has an id attribute with value 'someString'.
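Note that in both patterns the id test applies to the matched element itself, not to its a child. If the id is expected on the a child, as the wording of the question suggests, the predicate would be a/@id (or a[@id]) instead. A small sketch contrasting the two; the output element names id-on-parent and id-on-child are made up for illustration, and the unprefixed a assumes input without a namespace:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Matches e.g. <p id="x"><a/></p>: the element has a child a and its own id. -->
    <xsl:template match="*[a and @id]" priority="2">
        <id-on-parent><xsl:value-of select="@id"/></id-on-parent>
    </xsl:template>

    <!-- Matches e.g. <p><a id="x"/></p>: the element has a child a that carries an id. -->
    <xsl:template match="*[a/@id]" priority="1">
        <id-on-child><xsl:value-of select="a/@id"/></id-on-child>
    </xsl:template>
</xsl:stylesheet>
```

The explicit priorities only decide the case where an element satisfies both patterns; elements matching neither fall through to the built-in rules.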


Well, someone on Freenode's #xml channel helped me write this alternative stylesheet which works better:

<xsl:stylesheet version = '1.0'
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    >

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"
        doctype-public="-//W3C//DTD XHTML 1.1//EN"
        doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
        />

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="node()[xhtml:a/@id]">
        <xsl:copy>
            <xsl:copy-of select="xhtml:a/@id"/>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="xhtml:h3[@class='author']">
        <xsl:element name="h2">
            <xsl:copy-of select="xhtml:a/@id"/>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="xhtml:a/@id"/>

</xsl:stylesheet>

The part there with the "<xsl:template match="xhtml:h3[@class='author']">" may be safely ignored, because I added it later to fix a problem.

Regarding the output I needed to process: it is standard XHTML as generated by DocBook/XML, which has some <h2> / <h3> tags containing empty <a id="my_anchor_here" /> elements for anchors instead of doing the right thing and emitting <h2 id="my_anchor_here">, so I'm trying to filter it.
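With the stylesheet above, the id moves up to the parent but an emptied <a/> element stays behind in the output. A possible variant that also drops the anchor is sketched below. It rests on an assumption of mine, not on anything DocBook guarantees: an a element with an id, no href and no content is a pure anchor. Links that carry an id together with an href, such as footnote references, are left untouched.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Identity template: copy everything not handled below. -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Parent of a pure anchor: copy it and hoist the first anchor's id onto it. -->
    <xsl:template match="xhtml:*[xhtml:a[@id and not(@href) and not(node())]]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <!-- If the parent already had an id, this later attribute replaces it. -->
            <xsl:copy-of select="xhtml:a[@id and not(@href) and not(node())][1]/@id"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: an empty a with an id and no href is only an anchor, so drop it. -->
    <xsl:template match="xhtml:a[@id and not(@href) and not(node())]"/>
</xsl:stylesheet>
```

Applied to <h2 class="title"><a id="introduction"/>Introduction</h2>, this should produce an h2 that carries both class="title" and id="introduction" and contains only the text.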

Here is a self-contained sample:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Open Source, Free Software and Other Beasts (version 3)</title><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"/><link rel="home" href="#index" title="Open Source, Free Software and Other Beasts (version 3)"/><link rel="next" href="#introduction" title="Introduction"/></head><body><div xml:lang="en-GB" class="article"><div class="titlepage"><div><div><h1 class="title"><a id="index"/>Open Source, Free Software and Other Beasts (version 3)</h1></div><div><div class="authorgroup"><div class="author"><h3 class="author"><span class="firstname">Shlomi</span> <span class="surname">Fish</span></h3><div class="affiliation"><div class="address"><p><br/>
                    <code class="email">&lt;<a class="email" href="mailto:shlomif@shlomifish.org">shlomif@shlomifish.org</a>&gt;</code><br/>
                    <code class="uri"><a class="uri" href="http://www.shlomifish.org/"/></code><br/>
                </p></div></div></div></div></div><div><p class="copyright">Copyright © 2004, 2006, 2011 Shlomi Fish</p></div><div><div class="legalnotice"><a id="main_legal_notice"/><p>

        This document is copyrighted by Shlomi Fish under the 
        <a class="link" href="http://creativecommons.org/licenses/by/3.0/">Creative
        Commons Attribution License (CC-by) version 3.0</a> (or at 
        your option a greater version).
    </p></div></div></div><hr/></div><div class="section"><div class="titlepage"><div><div><h2 class="title"><a id="introduction"/>Introduction</h2></div></div></div><p>
Many people will hear about Linux in the news, being the cool new operating
system that everyone can use free of charge. Those who become interested in
it enough or actually start working with it, will learn that it is made out
of many independent "open source" components. Now, after enough time
(perhaps very soon), they will learn that the term "free software" (where
free is free as in "free speech" and not free as in "free beer") can be
used as an alternative to the adjective "open source". But what is open
source and free software? What distinguishes them from other software that
is available to the public at no cost or is distributed as shareware?
</p><p>
Note that the terms "free software" and "open source" would be used
throughout this article to refer to the same phenomenon. I do not religiously
stick to either term.
</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title"><a id="licences_and_proprietary_software"/>Software Licences and "Proprietary" Software</h2></div></div></div><p>
This section deals with the legal details of distributing software, and the
so-called licences that dictate what can be done with them.
</p><p>
Software out of being a sequence of bits, that can be transcribed to a
paper, spoken or otherwise transported is considered speech and so is
protected by the <a class="link" href="http://en.wikipedia.org/wiki/Freedom_of_speech">Freedom of Speech principle of Liberalism</a>. Thus, writing
software and distributing it are a constitutional right in most liberal
countries.
</p><p>
Nevertheless, a piece of software, as any other text, can be copyrighted.
Copyright involves making sure that the software as given to someone else
other than its originator or copyright holder will be restricted in use or
modification. An originator can outline what he believes to be a proper use
of the software in a code licence (which applies to the code) or an
"End-User License Agreement" (or EULA which applies to given binaries).
</p><p>
Proprietary software, i.e: such whose use, modification or distribution is
encumbered, was a relatively new phenomenon if you take a look at the old
history of computing. It actually started even before the time when
Microsoft, then a very small company wrote Altair Basic, and Bill Gates
published the famous (or possibly infamous) <a class="link" href="http://www.blinkenlights.com/classiccmp/gateswhine.html">"Open Letter to Altair Hobbyists"</a>. 
In fact, IBM and other companies distributed proprietary software for 
mainframe systems, a long time before the Personal Computer revolution.
</p><p>
The PC revolution, however, made the situation more critical. Soon,
computers became faster, more powerful, with larger memory, and more common
as time went by. At the moment, there are 100's of millions of Pentiums and
other computers out there, and millions of newer computers are sold each year.
</p><p>
Yet, the majority of these computers mostly run software that cannot be
modified or distributed, at least not effectively or legally. The free
software (or open-source) movement started as an anti-thesis to the
tendency of vendors to hide the details of their software from the public.
The Linux Operating System with its various components (most of which are
available to other systems as well, and are not affiliated with the Linux
kernel in particular) is the most visible showcase to this phenomena. By
installing Linux it is possible to turn an everyday personal computer into
a full fledged UNIX-based workstation or server, which is a 100% powerful GNU 
system. This can cost little if any money, and the various components of the 
operating system are all freely modifiable and can be re-distributed in their
modified form.
</p><p>
It is not the only place where free software can be used. It is in fact
possible to turn a Windows installation into a Linux-like GNU system as
well (see <a class="link" href="http://www.cygwin.com/">Cygwin</a> for instance) or run 
many native Microsoft Windows open-source programs on one's Windows 
installation.
</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title"><a id="meaning_of_terms"/>Meaning of the terms</h2></div></div></div><p>
According to the <a class="link" href="http://www.gnu.org/philosophy/free-sw.html">Free Software Definition </a> free software must fulfill 4 freedoms:
</p><div class="orderedlist"><ol class="orderedlist"><li class="listitem"><p>
The freedom to run the program, for any purpose
</p></li><li class="listitem"><p>
The freedom to study how the program works, and adapt it to your needs.
Access to the source code is a precondition for this.
</p></li><li class="listitem"><p>
The freedom to redistribute copies so you can help your neighbour
</p></li><li class="listitem"><p>
The freedom to improve the program, and release your improvements to the
public, so that the whole community benefits . Access to the source code is
a precondition for this. 
</p></li></ol></div><p>
    The <a class="link" href="http://www.opensource.org/docs/definition_plain.php">Open Source definition</a> is similar, but some licences can qualify as 
    open-source and not as free
software. This is usually not an issue, because the majority of open source
software out there is free as well. Moreover, lately most of the companies
and people who have phrased their own software licences, have tried to also
get the Free Software Foundation to approve their licences as free software
in their eyes.
</p><p>
Despite common belief, selling free/open-source software is perfectly
legitimate. In fact, one can charge as much as he pleases for it.
Nevertheless, most free software is distributed for free or for very
cheaply on the Internet and other mediums. This is due to the fact that its
freely distributable nature does not give way much to sale value, so there
usually is no point in attempting to mandate a charge for selling it.
</p><p>
Another common misconception is that it sometimes cannot be modified or
customised for internal use. In fact, all free software (but not <span class="emphasis"><em>all</em></span>
open source software), can. Only when you wish to distribute it (free of
charge or commercially), you may have to distribute your changes.
(depending on the licence) The use of open source software to process
proprietary content or be processed by non-free programs is also, always
available. Thus, an open-source C compiler can be used to compile the code
of proprietary programs like the Oracle Database Server.
</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title"><a id="history"/>History</h2></div></div></div><p>
This section is not a definitive overview of the history of the free
software movement. It focuses on the issues regarding the usage of the
common terms.
</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="history_unix_bsd"/>Early Days, AT&amp;T UNIX, BSD</h3></div></div></div><p>
The free software movement (before it was called this way) started
organically from individuals who distributed code they wrote under the
Public Domain or what would now be considered open source or semi-open
source licences. 
</p><p>
AT&amp;T UNIX that started at 1969 was the first showcase for this
movement. Several Bell Labs Engineers led by Ken Thompson developed UNIX
for their own use, and out of legal restrictions AT&amp;T faced, decided to
distribute it to academic organizations and other organizations free-of-charge
with the source included. (that licence did not qualify as open-source but
it was pretty close). UNIX eventually sported the C programming languages,
which enabled writing code that would run on many platforms easier, and the
UNIX sources included a C compiler that was itself written in C. Around the
early 70's the only computers capable of running UNIX were main-frames and
the so-called "mini-computers" so there initially weren't as many
installations as only large organizations could support buying computers to
deploy UNIX on. 
</p><p>
That changed as integrated circuits, and computers became cheaper and more
powerful. Very soon, cheap UNIX-based servers and workstations became
commonplace and the number of UNIX installations exploded. 
<sup>[<a id="present_day_unixes" href="#ftn.present_day_unixes" class="footnote">1</a>]</sup>
</p><p>
    Nadav Har'El has prepared <a class="link" href="http://groups.yahoo.com/group/hackers-il/message/1731">a coverage of the BSDs and early AT&amp;T UNIX
        history</a>.
</p><p>
The University of California at Berkeley (a.k.a UCB) forked its own version of 
AT&amp;T UNIX and started re-writing parts of the code, and incorporating many
changes of its own. The parts that the Berkeley developers wrote on their
own had originally been licensed UCB and kept as non-FOSS (= "free and open
source software") "All Rights Reserved" licence. The BSD system became very 
popular (perhaps even more than the AT&amp;T one).
</p><p>
When Arpanet, the predecessor to the Internet was disbanded due to inadequacy,
the Internet converted to running on top of 32-bit UNIX boxes such as
the <a class="link" href="http://en.wikipedia.org/wiki/VAX">VAX architecture by Digital
Equipment Corporation</a> (now part of Hewlett-Packard). This caused a 
merging of the UNIX culture with the Arpanet enthusiasts who exchanged code
on the Arpanet, and UNIX programmers started sharing code for various
components and add-ons of UNIX on the Internet.
</p></div></div></div></body></html>