How to read and output XML processing instructions in Scala?
I'm writing a small Scala app that does the following:
1) Read XML/XHTML files
2) Do some minor pre-processing
3) Transform it with an XSLT stylesheet, if required.
4) Post-process it slightly.
5) Save it as XHTML.
My XML files would start with something like:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../xslt/default.xslt"?>
If I read them with
scala.xml.XML.load(scala.xml.Source.fromFile(file))
I get an Elem, but I lose the XML processing instructions. I resorted to reading the file as a String, doing String manipulation to find the xml-stylesheet instruction, and then passing the text to
scala.xml.XML.load(scala.xml.Source.fromString(text))
There must be a better way of doing this. I need to know which stylesheet I should use inside Scala because Scala has to call the XSLT processor, if needed.
Also, after I'm done processing them, I save them using
scala.xml.Utility.trim(transformed).buildString(true)
but the resulting document does not contain the XML declaration, nor the HTML DOCTYPE. I want to have those too.
I know that this is technically two questions, but those are basically both ends of the same problem, and I suspect that the solution to the second problem is related to the solution of the first one.
Basically, Scala's XML is inadequate for your needs. Even with xml.parsing.XhtmlParser, which generates a Document, you'd only get version, encoding, and dtd. You could write a constructing parser, an event parser, or override XML with a custom SAXParser to get the XSLT stuff, but you'd still not be able to represent that information with Scala XML, and you'd still have to hand-code your save to append that stuff back.
So I suggest you stick with one of the Java libraries that handle XSLT.
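As a rough illustration of that suggestion, here is a minimal sketch that uses only the JDK's own DOM and javax.xml.transform APIs from Scala. The object name XmlProlog and its methods parse, stylesheetHref and serialize are hypothetical, and the XHTML 1.0 Strict DOCTYPE identifiers are an assumption, since the question does not say which DOCTYPE is wanted. The DOM keeps processing instructions as children of the document node, and the identity transformer can be asked to write the XML declaration and a DOCTYPE.

```scala
import java.io.{StringReader, StringWriter}
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.transform.{OutputKeys, TransformerFactory}
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult
import org.w3c.dom.{Document, ProcessingInstruction}
import org.xml.sax.InputSource

// Hypothetical helper object; none of these names come from the question.
object XmlProlog {
  // Pulls the href pseudo-attribute out of the processing instruction's data.
  private val Href = """href\s*=\s*["']([^"']*)["']""".r

  // Parse XML text into a W3C DOM Document (prolog PIs are kept as child nodes).
  def parse(xml: String): Document = {
    val factory = DocumentBuilderFactory.newInstance()
    factory.setNamespaceAware(true)
    factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)))
  }

  // Return the href of the first top-level <?xml-stylesheet ...?> instruction, if any.
  def stylesheetHref(doc: Document): Option[String] = {
    val children = doc.getChildNodes
    (0 until children.getLength)
      .map(i => children.item(i))
      .collect {
        case pi: ProcessingInstruction if pi.getTarget == "xml-stylesheet" => pi.getData
      }
      .flatMap(data => Href.findFirstMatchIn(data).map(_.group(1)))
      .headOption
  }

  // Serialize with an XML declaration and a DOCTYPE.
  // Assumption: XHTML 1.0 Strict; substitute the identifiers you actually need.
  def serialize(doc: Document): String = {
    val transformer = TransformerFactory.newInstance().newTransformer() // identity transform
    transformer.setOutputProperty(OutputKeys.METHOD, "xml")
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8")
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no")
    transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, "-//W3C//DTD XHTML 1.0 Strict//EN")
    transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd")
    val out = new StringWriter()
    transformer.transform(new DOMSource(doc), new StreamResult(out))
    out.toString
  }
}
```

The value returned by stylesheetHref could then be resolved against the input file's location and handed to TransformerFactory.newTransformer as a StreamSource to run the actual XSLT step. Note that the identity transform shown here also writes the xml-stylesheet instruction back out, so you would need to remove that node from the DOM first if it should not appear in the saved XHTML.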