What can we do to make XML processing faster?
We work on an internal corporate system that has a web front-end as one of its interfaces.
The front-end (Java + Tomcat + Apache) communicates to the back-end (proprietary system written in a COBOL-like language) through SOAP web serv开发者_开发问答ices.
As a result, we pass large XML files back and forth.
We believe that this architecture has a significant impact on performance due to the large overhead of XML transportation and parsing. Unfortunately, we are stuck with this architecture.
How can we make this XML set-up more efficient?
Any tips or techniques are greatly appreciated.
Profiling!
Do some proper profiling of your system under load - there isn't really enough information to go on here.
You need to work out where the time is going and what the bottleknecks are (network bandwidth, cpu, memory etc...). Only then will you know what to do about it - many optimisations are really just trade-offs (for example caching is sacrificing memory to improve performance elsewhere)
The only thing that I can think of off-hand is making sure that you are using HTTP compression with web services - XML can usually be compacted down to a fraction of its normal size, but again this will only help if you have CPU cycles to spare.
You can compress the transfer if both ends can support that, and you can try different parsers, but since you say SOAP there aren't many choices. SOAP is bloated anyway.
I'm going to go out on a limb here and suggest GZIP Compression if you think it is due to bandwidth issues. (you mentioned XML Transportation) Yes, this would increase your CPU time, but it might speed things up in the transport.
Here's the first Google hit on GZIP Compression as a starting point. It describes how it works on Apache.
First make sure that your parsing methods are efficient for large documents. StAX is a good one for parsing large documents.
Additionally, you can take a look at binary XML approaches. These provide more efficient transport but also attempt to aid in parsing.
Try StAX. It performs well and has a nice, concise syntax.
Check if your application reads in the whole XML documents as a DOM tree. Those may get VERY big, and frequently you can do with a simple SAX event inspection or a SAX-based XSLT program (which can be compiled for fast processing).
This is very visible in a profiler like visualvm in the Sun Java 6 JDK
精彩评论