XSLT transformations on very large files

Posted: 2022-12-10 06:30

We are using XSLT to generate reports from our data. The data is currently stored in Oracle as XML documents (in plain CLOB columns, not XMLTYPE). We select the relevant XML documents and combine them into a single document:

<DATABASE>
   <XMLDOCUMENT> ... </XMLDOCUMENT>
   <XMLDOCUMENT> ... </XMLDOCUMENT>
   ...
</DATABASE>

In some cases, the combined XML document contains more than 100,000 documents. This means that a huge XML document is first loaded into memory, causing all kinds of memory issues.

How can we prevent this from happening? We are using the XslCompiledTransform class in .NET 2.0.

I know that there are two ways of parsing XML documents: DOM and SAX. But as I understand it, SAX cannot be used in combination with XSLT, and DOM parsing forces us to load the entire thing into memory.

What are our options? Does it help to write the complete document to disk first? Would Oracle do a better job on large XSLT transformations?

Depending on what kinds of transformations you want to do, STX might be an alternative to XSLT:

Streaming Transformations for XML (STX) is a one-pass transformation language for XML documents. STX is intended as a high-speed, low memory consumption alternative to XSLT, using the W3C XQuery 1.0 and XPath 2.0 Data Model. Since STX does not require the construction of an in-memory tree, it is suitable for use in resource constrained scenarios.

There is a third XML processing model, VTD-XML, that overcomes most of DOM's memory issues and natively supports XPath, so it is worth a look. XSLT support for it is on the way.

This may help: the XMLMax XML editor can apply an XSL stylesheet to each fragment matching an XPath expression and write all the matching outputs to a single file, wrapped in a user-specified root element. It has no file size limitations. Search for "XMLMax editor".

As far as I know, a CLOB can be streamed. Streaming it to the local file system is one option, of course, but then you hit the same problem, because most XSLT engines operate on a DOM. I would suggest splitting the file into smaller chunks (the XMLDOCUMENT elements in your case). This can be done without XSLT, using just a simple regular expression. Then run your XSLT transformation on each individual chunk. This will be slower than doing it all in memory, but it saves you from memory problems when the document is too large.
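The chunk-by-chunk idea can be sketched roughly as follows. This is a minimal illustration in Python rather than .NET, and it swaps the regular-expression split for a streaming parser (`xml.etree.ElementTree.iterparse`), which yields one XMLDOCUMENT at a time and discards it afterwards. The names `iter_documents`, `transform_document`, `transform_large_file`, and the `REPORTS`/`REPORT` output elements are all hypothetical; `transform_document` is only a stand-in for wherever the real per-chunk XSLT call would go.

```python
import io
import xml.etree.ElementTree as ET


def iter_documents(source, tag="XMLDOCUMENT"):
    """Yield each top-level <XMLDOCUMENT> without building the whole tree."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event: start of the <DATABASE> root
    depth = 0  # nesting depth below the root element
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 0 and elem.tag == tag:
            yield elem
            root.clear()  # drop the processed chunk so memory stays flat


def transform_document(elem):
    """Hypothetical stand-in for the per-chunk XSLT step.

    Here it only reports how many direct children the chunk has.
    """
    report = ET.Element("REPORT")
    report.text = str(len(elem))
    return ET.tostring(report, encoding="unicode")


def transform_large_file(source, sink, transform, root_tag="REPORTS"):
    """Transform each chunk and write all results under a single root."""
    sink.write(f"<{root_tag}>\n")
    count = 0
    for doc in iter_documents(source):
        sink.write(transform(doc) + "\n")
        count += 1
    sink.write(f"</{root_tag}>\n")
    return count


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<DATABASE>"
        b"<XMLDOCUMENT><A/><B/></XMLDOCUMENT>"
        b"<XMLDOCUMENT><A/></XMLDOCUMENT>"
        b"</DATABASE>"
    )
    out = io.StringIO()
    transform_large_file(sample, out, transform_document)
    print(out.getvalue())
```

Only one XMLDOCUMENT is held in memory at a time, so peak usage depends on the largest single chunk rather than on the whole file. The trade-off is that a stylesheet applied this way cannot look across documents (for example, to sort or total over all of them).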
