What is faster in XML parsing: elements or attributes?
I am writing code that parses XML.
I would like to know what is faster to parse: elements or attributes.
This will have a direct effect over my XML design.
Please target the answers to C# and the differences between LINQ and XmlReader.
Thanks.
Design your XML schema so that the representation of the information actually makes sense. Usually, the decision between making something an attribute or an element will not affect performance.
Performance problems with XML are in most cases related to large amounts of data represented in a very verbose XML dialect. A typical countermeasure is to zip the XML data when storing it or transmitting it over the wire.
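To illustrate the compression idea, here is a minimal sketch using GZipStream from System.IO.Compression. The class name XmlGzipSketch and its two helper methods are made up for this example, and it assumes the XML is UTF-8 text that fits in memory.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper class, named for this example only.
public static class XmlGzipSketch
{
    // Compresses an XML string (encoded as UTF-8) with gzip.
    public static byte[] Compress(string xml)
    {
        byte[] raw = Encoding.UTF8.GetBytes(xml);
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining compressed bytes.
            using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // Reverses Compress and returns the original XML text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Repetitive markup such as thousands of identical tag names tends to compress well, which is why zipping is usually tried before redesigning the schema.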
If that is not sufficient then switching to another format such as JSON, ASN.1 or a custom binary format might be the way to go.
Addressing the second part of your question: The main difference between the XDocument (LINQ) and the XmlReader class is that the XDocument class builds a full document object model (DOM) in memory, which might be an expensive operation, whereas the XmlReader class gives you a tokenized stream on the input document.
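The difference between the two APIs can be sketched as follows. Both methods collect the id attribute of every person element; the class name ParseSketch, the method names, and the person/id vocabulary are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class, named for this example only.
public static class ParseSketch
{
    // LINQ to XML: the whole document is loaded into memory first,
    // then queried like any other in-memory collection.
    public static List<string> IdsWithXDocument(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("person")
                  .Select(p => (string)p.Attribute("id")) // null if the attribute is missing
                  .Where(id => id != null)
                  .ToList();
    }

    // XmlReader: forward-only and one node at a time, so nothing is kept
    // in memory except what the caller chooses to store.
    public static List<string> IdsWithXmlReader(string xml)
    {
        var ids = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "person")
                {
                    string id = reader.GetAttribute("id");
                    if (id != null) ids.Add(id);
                }
            }
        }
        return ids;
    }
}
```

The XDocument version is shorter and easier to query, while the XmlReader version keeps memory use flat for large inputs. Which one is faster for a given workload is something to measure rather than assume.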
With XML, speed is dependent on a lot of factors.
With regard to attributes or elements, pick the one that more closely matches the data. As a guideline, we use attributes for, well, attributes of an object, and elements for contained sub-objects.
Depending on the amount of data you are talking about, using attributes can save you a bit on the size of your XML streams. For example, <person id="123" /> is smaller than <person><id>123</id></person>. This doesn't really impact the parsing, but it will impact the speed of sending the data across a network wire or loading it from disk. If we are talking about thousands of such records, then it may make a difference to your application.
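As a quick check of the size claim, the sketch below counts the UTF-8 bytes of the two person snippets. The class name SizeSketch and its members are made up for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper class, named for this example only.
public static class SizeSketch
{
    public const string AttributeForm = "<person id=\"123\" />";
    public const string ElementForm = "<person><id>123</id></person>";

    // Number of bytes the snippet occupies when encoded as UTF-8.
    public static int Utf8Size(string xml)
    {
        return Encoding.UTF8.GetByteCount(xml);
    }
}
```

SizeSketch.Utf8Size(SizeSketch.AttributeForm) gives 19 and SizeSketch.Utf8Size(SizeSketch.ElementForm) gives 29, so this particular shape saves 10 bytes per record before any compression is applied.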
Of course, if that actually does make a difference then using JSON or some binary representation is probably a better way to go.
The first question you need to ask is whether XML is even required. If it doesn't need to be human readable then binary is probably better. Heck, a CSV or even a fixed-width file might be better.
With regard to LINQ vs XmlReader, this is going to boil down to what you do with the data as you are parsing it. Do you need to instantiate a bunch of objects and handle them that way, or do you just need to read the stream as it comes in? You might even find that basic string manipulation on the data is the easiest/best way to go.
Point is, you will probably need to examine the strengths of each approach beyond just "what parses faster".
I don't have hard numbers to prove it, but I know that the WCF team at Microsoft chose to make the DataContractSerializer their standard for WCF. It's limited in that it doesn't support XML attributes, but it is reportedly up to 10-15% faster than the XmlSerializer.
From that information, I would assume that using XML attributes will be slower to parse than if you use only XML elements.