How to use the "about:" protocol of HTML5 in XSLT processors

The HTML5 draft specifies (at the moment at least) that the URI about:legacy-compat can be used for documents that rely on an XML-conforming doctype (which <!DOCTYPE html> isn't).

So I happen to have a bundle of HTML5-validating XML files that start with:

<!DOCTYPE html SYSTEM "about:legacy-compat">

Unfortunately, when I feed such an XHTML5 document to an XSLT processor such as Xalan or Saxon, it naturally tries to resolve the (unresolvable) URI.

Is there any way to make these processors ignore the URI or fake-resolve it under the hood? The attempt to resolve it happens early in processing, so, for example, Saxon's -dtd:off switch has no effect here.

Edit: The low-level approach sed -n '2,$p' <htmlfile> | otherapp unfortunately only works until I start using the document() function to load another XHTML5 file.

Edit 2: I played around with XML catalogs and got them to work with both Saxon and Xalan. However, I then always get a

java.net.MalformedURLException: unknown protocol: about

That's not surprising, but how can I work around it? The URL should never be parsed, just thrown away.


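Save the following Java file as $somepath/foo/about/Handler.java. The class has to be named Handler and sit in a package ending in .about, because the JVM looks the handler up as <package prefix>.<protocol>.Handler: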

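package foo.about;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Handler for the "about" protocol. The JVM finds it by name as
// <prefix from java.protocol.handler.pkgs>.about.Handler.
public class Handler extends java.net.URLStreamHandler {

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        return new URLConnection(url) {

            @Override
            public void connect() throws IOException {
                // Nothing to connect to; just mark the connection as open.
                connected = true;
            }

            @Override
            public InputStream getInputStream() throws IOException {
                // Serve a minimal stand-in DTD instead of fetching anything.
                return new ByteArrayInputStream(
                        "<!ELEMENT html ANY>".getBytes(StandardCharsets.US_ASCII));
            }
        };
    }
}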

Now change to $somepath and compile it:

javac foo/about/Handler.java

Add the following arguments to the JVM when calling Saxon:

-Djava.protocol.handler.pkgs=foo -cp "$somepath"
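
If the XSLT processor is embedded in your own Java program rather than started from the command line, the same idea can probably be applied in code through URL.setURLStreamHandlerFactory instead of the system property. The following is only a sketch under that assumption: the class names AboutProtocol and AboutHandler are made up for illustration, and the JVM accepts only one factory per process, so this will not work if another library has already installed one.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Saxon or Xalan): registers a handler
// for "about:" URLs programmatically instead of via -D on the command line.
final class AboutProtocol {

    private static boolean installed = false;

    // Call once, before any document with the about:legacy-compat doctype is parsed.
    static synchronized void install() {
        if (installed) {
            return; // the JVM allows the factory to be set only once
        }
        // Returning null for other protocols lets the JVM fall back to its defaults.
        URL.setURLStreamHandlerFactory(protocol ->
                "about".equals(protocol) ? new AboutHandler() : null);
        installed = true;
    }

    // Same behaviour as the Handler class above: serve a stub DTD.
    private static final class AboutHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    connected = true;
                }

                @Override
                public InputStream getInputStream() {
                    return new ByteArrayInputStream(
                            "<!ELEMENT html ANY>".getBytes(StandardCharsets.US_ASCII));
                }
            };
        }
    }
}
```

Calling AboutProtocol.install() at the start of main should then have the same effect as the -Djava.protocol.handler.pkgs argument, without requiring the foo.about package layout.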

Here is a modified launcher shell script that passes these JVM arguments (written for *nix systems, but the Windows version is very similar):

#!/bin/sh

exec java -Djava.protocol.handler.pkgs=foo -classpath /usr/share/java/saxonb.jar:"$somepath" net.sf.saxon.Transform "$@"

If this doesn't work as is, adapt your local saxonb-xslt script in the same way instead.
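
For documents that you parse yourself through JAXP, a SAX EntityResolver may be a lighter alternative, since the parser asks the resolver before it tries to open the URL. The sketch below is an assumption-laden illustration, not something tested against Saxon or Xalan: LegacyCompatResolver and its parse helper are hypothetical names, and it answers the about:legacy-compat system identifier with an empty DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: short-circuits the about:legacy-compat doctype
// so the parser never tries to open an "about:" URL.
final class LegacyCompatResolver implements EntityResolver {

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if ("about:legacy-compat".equals(systemId)) {
            // An empty external DTD subset is enough for a non-validating parse.
            return new InputSource(new StringReader(""));
        }
        return null; // everything else is resolved the default way
    }

    // Convenience helper for this example: parse an XML string with the resolver installed.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(new LegacyCompatResolver());
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

This only covers input you hand to the parser yourself. Files pulled in through document() are likely read by the processor's own parser, so the protocol handler remains the more complete fix for that case.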
