Performance hit using JAR files
Do you take a performance hit when packaging your classes into JAR files rather than just running the unpackaged classes? Say, for example, you have a large application: if many files need to be pulled from the archive, would this slow down your application?
No, it won't. The classes are loaded in memory and used from there.
It might slow startup slightly, but the effect is negligible. If you aggressively load and unload classes at runtime there might also be a difference, but it is still negligible.
The bottom line is: you should not unpack JARs for performance reasons.
It is unlikely that loading classes from exploded directories would result in any performance benefit. If that were to be the case, then several Java applications (especially Java EE application servers) would see a performance benefit when running off exploded JARs.
A more concrete reason for the absence of a performance benefit is that a JAR is a single, typically compressed file, so its contents tend to sit in one specific region of the disk, which is unlikely for the many files of an exploded JAR. This also means there is a real possibility of a performance hit when loading classes from an exploded JAR.
Additionally, the class loading operation is usually performed only once per class. Unless classes go through several cycles of loading and unloading (for example because of a poorly sized permanent generation, which Java 8 replaced with Metaspace), it is very unlikely that class loading is a factor that accounts for poor performance.
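To illustrate the "only once" point, here is a minimal sketch showing that a class loader hands back the same Class object on a repeated request instead of reading the class bytes again. The class name LoadOnceDemo and the helper sameClassObject are made up for this example, and the timings it prints are only indicative since nothing here controls for JIT or caching effects.

```java
public class LoadOnceDemo {

    // Asks the system class loader for the same class twice and reports
    // whether both requests returned the identical Class object.
    static boolean sameClassObject(String className) throws ClassNotFoundException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        Class<?> first = loader.loadClass(className);
        Class<?> second = loader.loadClass(className);
        return first == second;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        String name = "java.util.zip.Adler32"; // arbitrary JDK class, chosen for illustration

        long t0 = System.nanoTime();
        loader.loadClass(name);
        long t1 = System.nanoTime();
        loader.loadClass(name);
        long t2 = System.nanoTime();

        System.out.println("same Class object: " + sameClassObject(name));
        System.out.println("first request  (ns): " + (t1 - t0));
        System.out.println("second request (ns): " + (t2 - t1));
    }
}
```

Whether the bytes originally came from a JAR or from a directory no longer matters once the class is defined, which is why the packaging choice mostly affects startup.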
Startup may take a bit longer, depending on the compression, but once the application is fully running there should be no performance hit.
It depends on whether reading the compressed data plus decompressing it takes longer than reading the uncompressed data, and even then it matters only at startup.
No, it won't. On a large application, unpacking will probably make things worse. The typical I/O overhead is higher than the gain from not decompressing. Additionally, JARs have an index, which speeds up repeated access. Traversing the directory tree in the file system usually costs more than accessing the indexed JAR.
The performance problem goes away as soon as your operating system starts to cache the files in memory. From Java 6 on, the JRE checks the cache first on Windows. With a warm cache, an application that takes minutes to load will load in 10-20 seconds. The effect should be similar for JARs and *.class files in directories. Just remember not to confuse this effect with an optimization you are making, and always measure on a cold start.
There is a trade-off here, but mostly you want to use JAR files.
If you are loading just one class, you can load it faster as a file than by reading it out of a JAR file. But you will likely spend more time finding the class, especially if it is several directories deep and amid numerous other classes.
If you are loading many classes, loading them exposed will be a big loss: There will be steps to open, read, and close each of the files separately, whereas you would do that only once for reading from a JAR file. (Presumably, you keep the JAR open between class loads; re-opening the JAR for each class load would be a huge problem if you are doing multiple class loads.)
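If you want to see the difference on your own machine, a rough sketch like the following can help. It writes the same small payloads both as individual files and as entries of one JAR, then reads them back each way: one open/read/close per file versus a single JAR handle kept open across all reads. The class name, the entry names and the count of 2000 are all made up for the example, it needs Java 9+ for InputStream.readAllBytes, and because the files were just written the OS cache is warm, so treat the timings as indicative only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarVsDirectoryRead {

    // Writes `count` small files into `dir` and the same payloads into `jar`.
    static void createFixtures(Path dir, Path jar, int count) throws IOException {
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            for (int i = 0; i < count; i++) {
                String name = "Entry" + i + ".bin";
                byte[] payload = ("payload-" + i).getBytes(StandardCharsets.UTF_8);
                Files.write(dir.resolve(name), payload);
                jos.putNextEntry(new JarEntry(name));
                jos.write(payload);
                jos.closeEntry();
            }
        }
    }

    // Exposed files: every read opens, reads and closes a separate file.
    static long readFromDirectory(Path dir, int count) throws IOException {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += Files.readAllBytes(dir.resolve("Entry" + i + ".bin")).length;
        }
        return total;
    }

    // JAR: opened once, then every entry is read through the same handle.
    static long readFromJar(Path jar, int count) throws IOException {
        long total = 0;
        try (JarFile jf = new JarFile(jar.toFile())) {
            for (int i = 0; i < count; i++) {
                JarEntry entry = jf.getJarEntry("Entry" + i + ".bin");
                try (InputStream in = jf.getInputStream(entry)) {
                    total += in.readAllBytes().length;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int count = 2000;
        Path dir = Files.createTempDirectory("exploded");
        Path jar = Files.createTempFile("packed", ".jar");
        createFixtures(dir, jar, count);

        long t0 = System.nanoTime();
        long dirBytes = readFromDirectory(dir, count);
        long t1 = System.nanoTime();
        long jarBytes = readFromJar(jar, count);
        long t2 = System.nanoTime();

        System.out.printf("directory: %d bytes in %d us%n", dirBytes, (t1 - t0) / 1_000);
        System.out.printf("jar:       %d bytes in %d us%n", jarBytes, (t2 - t1) / 1_000);
    }
}
```

This only reads raw bytes; real class loading adds defining and linking the class on top, which costs the same in both cases.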
In more detail, for class loading, the two strategies here are:
(1) Classes exposed on disk. (2) Classes packed in a JAR file.
For (1), there will be some cost of walking from the root directory down to the target class. That might be done incrementally and on demand, or it might be done up front for the entire directory tree. If very many directories need to be listed, that cost will be quite large.
Also for (1), there is the cost of opening and reading the individual classes. For one class, that is less than the cost of opening the JAR, but for many classes, the cost will eventually get to be larger, because of the extra overhead of opening the individual files.
For (2), there is an extra cost of the initial JAR open, which will read the index of entries of the JAR. That will be proportional to the number of entries in the JAR.
For (2), when reading a single resource for a class load, there will be the cost of seeking, decompressing, and verifying the class resource. Decompression generally is a wash, as compression reduces read time but adds the decompression step. Verifying is extra, but can be turned off.
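As a rough illustration of the steps for (2), the sketch below opens a JAR with verification switched off (the second argument of the JarFile(File, boolean) constructor), looks up a single entry and decompresses it. The class name JarEntryRead, the helpers buildSampleJar and readEntry, and the entry name are assumptions made for this example; Java 9+ is assumed for InputStream.readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarEntryRead {

    // Builds a tiny JAR holding one entry (JarOutputStream compresses by default).
    static Path buildSampleJar(String entryName, byte[] payload) throws IOException {
        Path jar = Files.createTempFile("sample", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(payload);
            jos.closeEntry();
        }
        return jar;
    }

    // Opens the JAR without signature verification and reads one entry.
    static byte[] readEntry(Path jar, String entryName) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile(), false)) { // false = do not verify
            JarEntry entry = jf.getJarEntry(entryName);       // lookup in the JAR's entry index
            if (entry == null) {
                throw new IOException("no such entry: " + entryName);
            }
            try (InputStream in = jf.getInputStream(entry)) { // seek + decompress
                return in.readAllBytes();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello hello hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        Path jar = buildSampleJar("com/example/Sample.bin", payload);

        try (JarFile jf = new JarFile(jar.toFile(), false)) {
            JarEntry entry = jf.getJarEntry("com/example/Sample.bin");
            System.out.println("entries in index: " + jf.size());
            System.out.println("uncompressed size: " + entry.getSize());
            System.out.println("compressed size:   " + entry.getCompressedSize());
        }
        System.out.println("bytes read: " + readEntry(jar, "com/example/Sample.bin").length);
    }
}
```

How much verification actually costs depends on whether the JAR is signed at all; for an unsigned JAR there is little to verify either way.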
Net of this, for (1), if the entire directory is read, that will have a much larger cost than, in (2), for opening the JAR. In (1), if somehow you incrementally read the directory, that might be initially quicker, but the cost of reading the individual classes would eventually be greater than the JAR read cost of (2). "Eventually" would probably not be very many class loads.
For comparing the two scenarios, there are these additional issues which bear consideration:
(1) The overhead of having the files on disk is much higher than of having just the one JAR file in terms of raw storage and file system overhead.
(2) Accessing classes as individual files will use up multiple file handles; accessing classes from a JAR will use just one.
(3) Transporting the classes becomes a chore, as you have to do a directory copy, or pack, copy, and unpack. Copying many individual files is much more expensive than copying the single JAR file.
(4) Using JARs lets you use the JAR signing features which are built into the class loading APIs.
In the space of application servers, the impact of the choice between (1) and (2) shows itself when looking at how classes are packaged beneath WAR files. That is, classes can be packaged under WEB-INF/classes, or in JAR files under WEB-INF/lib. If the application server unpacks WAR files for access from the running server, having classes under WEB-INF/classes adds very large overheads. Application servers generally prefer the classes be kept in JAR files.