Sum of DOM elements using XPath
I am using MSXML v3.0 in a VB 6.0 application. The application calculates the sum of the Amount value of all Transaction nodes using a For Each loop, as shown below:
Set subNodes = docXML.selectNodes("//Transaction")
For Each subNode In subNodes
total = total + Val(subNode.selectSingleNode("Amount").nodeTypedValue)
Next
This loop takes too much time; sometimes it takes 15-20 minutes for 60 thousand nodes. I am looking for an XPath/DOM solution to eliminate this loop, probably something like
docXML.selectNodes("//Transaction").Sum("Amount")
or
docXML.selectNodes("Sum(//Transaction/Amount)")
Any suggestion for getting this sum faster is welcome.
// Requires: using System; using System.Xml.XPath;
// Open the XML.
XPathDocument docNav = new XPathDocument(@"c:\books.xml");
// Create a navigator to query with XPath.
XPathNavigator nav = docNav.CreateNavigator();
// Find the sum. This expression uses standard XPath syntax.
string strExpression = "sum(/bookstore/book/price)";
// Use the Evaluate method to return the evaluated expression.
Console.WriteLine("The price sum of the books are {0}", nav.Evaluate(strExpression));
source: http://support.microsoft.com/kb/308333
Any solution that uses the XPath // pseudo-operator on an XML document with 60000+ nodes is going to be quite slow, because //x causes a complete traversal of the tree starting at the root of the document.
The solution can be sped up significantly if a more exact XPath expression is used, one that doesn't include the // pseudo-operator.
If you know the structure of the XML document, always use a specific chain of location steps and never use //.
If you provide a small example showing the specific structure of the document, then many people will be able to provide a faster solution than any solution that uses //.
For example, if it is known that all Transaction elements can be selected using this XPath expression:
/x/y/Transaction
then the evaluation of
sum(/x/y/Transaction/Amount)
is likely to be significantly faster than sum(//Transaction/Amount)
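The two expressions above can be compared side by side with a small sketch. It assumes the .NET XPath engine (System.Xml.XPath) rather than MSXML, and the sample document, including the placeholder element names x and y taken from the expression above, is made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class SumDemo
{
    static void Main()
    {
        // Hypothetical sample document; x and y are the placeholder names used above.
        const string xml =
            "<x><y>" +
            "<Transaction><Amount>10.5</Amount></Transaction>" +
            "<Transaction><Amount>4.5</Amount></Transaction>" +
            "</y></x>";

        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // Specific chain of location steps: only the named path is visited.
        double viaPath = (double)nav.Evaluate("sum(/x/y/Transaction/Amount)");

        // Descendant search: every node in the document is a candidate.
        double viaDescendant = (double)nav.Evaluate("sum(//Transaction/Amount)");

        Console.WriteLine("specific path: {0}, descendant search: {1}", viaPath, viaDescendant);
    }
}
```

Both calls return the same total (15 for this sample); only the amount of tree traversal differs. Note that the XPath function name is the lowercase sum, since XPath function names are case-sensitive.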
Update:
The OP has revealed in a comment that the structure of the XML file is quite simple.
Accordingly, I tried the following with an XML document containing 60000 Transaction nodes:
/*/*/Amount
With .NET XslCompiledTransform (yes, I used XSLT as the host for the XPath engine) this took 220 ms (0.22 seconds) to produce the sum.
With MSXML3 it takes 334 seconds.
With MSXML6 it takes 76 seconds -- still quite slow.
Conclusion: This is a bug in MSXML3 -- try to upgrade to another XPath engine, such as the one offered by .NET.
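To try a comparable measurement on your own machine, a rough timing harness might look like the sketch below. It is not the XSLT-hosted test described above: it calls the .NET XPath engine directly through XPathNavigator, and the document layout (a root element named Transactions with Transaction/Amount children) is an assumption, since the real file structure was not posted. Timings will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml.XPath;

class SumTiming
{
    static void Main()
    {
        // Build a synthetic document with 60000 Transaction elements (assumed layout).
        var sb = new StringBuilder("<Transactions>");
        for (int i = 0; i < 60000; i++)
        {
            sb.Append("<Transaction><Amount>1.5</Amount></Transaction>");
        }
        sb.Append("</Transactions>");

        XPathNavigator nav = new XPathDocument(new StringReader(sb.ToString())).CreateNavigator();

        // Compile once, then time only the evaluation of the sum.
        XPathExpression expr = XPathExpression.Compile("sum(/*/*/Amount)");
        Stopwatch watch = Stopwatch.StartNew();
        double total = (double)nav.Evaluate(expr);
        watch.Stop();

        Console.WriteLine("sum = {0}, elapsed = {1} ms", total, watch.ElapsedMilliseconds);
    }
}
```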