Should I be concerned with large number of dependencies?
I was just about to include the HtmlUnit library in a project. I unpacked the zip-file and realised that it had no less than 12 dependencies.
I've always been concerned when it comes to introducing dependencies. I suppose I have to ship all these dependencies together with the application (8.7 mb in this particular case). Should I bother checking for, say, security updates for these libraries? Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another l开发者_开发百科ibrary I need, depends on a different version of xalan?
The task HtmlUnit solves for me could be solved "manually" but that would probably not be as elegant.
Should I be concerned about this? What are the best practices in situations like these?
Edit: I'm interested in the general situation, not particularly involving HtmlUnit. I just used it here as an example as that was my current concern.
Handle your dependencies with care. They can bring you much speed, but can be a pain to maintain down the road. Here are my thoughts:
- Use some software to maintain your dependencies. Maven is what I would use for Java to do this. Without it you will very soon loose track of your dependencies.
- Remember that the various libraries have different licenses. It is not granted that a given license works for your setting. I work for a software house and we cannot use GPL based libraries in any of the software we ship, as the software we sell are closed source. Similarly we should avoid LGPL as well if we can (This is due to some intricate lawyer reasoning, don't ask me why)
- For unit testing I'd say go all out. It is not the end of the world if you have to rewrite your tests in the future. It might even be then that that part of the software is either extremely stable or maybe not even maintained no more. Loosing those is not that big of a deal as you already had a huge gain of gaining speed when you got it.
- Some libraries are harder to replace later than others. Some are like a marriage that should last the life of the software, but some other are just tools that are easily replaceable. (Think Spring versus an xml library)
- Check out how the community support older versions of the library. Are they supporting older versions? What happens when life continues and you are stuck at a version? Is there an active community or do you have the skill to maintain it yourself?
- For how long are your software supposed to last? Is it one year, five year, ten year or beyond? If the software has short time span, then you can use more to get where you are going as it is not that important to be able to keep up with upgrading your libraries.
It could be a serious issue if there isn't a active community which does maintain the libraries on long term. It is ok to use libraries, but to be honest you should care to get the sources and put them into your VCS.
Should I bother checking for, say, security updates for these libraries?
In general, it is probably a good idea to do this. But then so should everyone upstream and downstream of you.
In your particular case, we are talking about test code. If potential security flaws in libraries used only in testing are significant, your downstream users are doing something strange ...
Finally (and most importantly, actually what I'm most concerned about): What if I want to include another library which depends on the same libraries as this library, but with different versions? That is, what if for instance HtmlUnit depends on one version of xalan and another library I need, depends on a different version of xalan?
Ah yes. Assuming that you are building your own classpaths, etc by hand, you need to make a decision about which version of the dependent libraries you should use. It is usually safe to just pick the most recent of the versions used. But if the older version is not backwards incompatible with the new (for your use case) then you've got a problem.
Should I be concerned about this?
IMO, for your particular example (where we are talking about test code), no.
What are the best practices in situations like these?
Use Maven! It explicitly exposes the dependencies to the folks who download your code, making it possible for them to deal with the issue. It also tells you when you've got dependency version conflicts and provides a simple "exclude" mechanism for dealing with it.
Maven also removes the need to create distributions. You publish just your artifacts with references to their dependents. The Maven command then downloads the dependent artifacts from wherever they have been published.
EDIT
Obviously, if you are using HtmlUnit for production code (rather than just tests), then you need to pay more attention to security issues.
A similar thing has happened to me actually.
Two of my dependencies had the same 'transitive' dependency but a different version.
My favorite solution is to avoid "dependency creep" by not including too many dependencies. So, the simplest solution would be to remove the one I need less, or the one I could replace with a simple Util class, etc.
Too bad, it's not always that simple. In unfortunate cases where you actually need both libraries, it is possible to try to sync their versions, i.e. downgrade one of them so that dependency versions match.
In my particular case, I manually edited one of the jars, deleted the older dependency from it, and hoped it would still work with new version loaded from other jar. Luckily, it did (i.e. maintainers of the transitive dependency didn't remove any classes or methods that library used).
Was it ugly - Yes (Yuck!), but it worked.
I try to avoid introducing frivolous dependencies, because it is possible to run into conflicts.
One interesting technique I have seen used to avoid conflicts: renaming a library's package (if its license allows you to -- most BSD-style licenses do.) My favorite example of this is what Sun did when they built Xerces into the JRE as the de-facto JAXP XML parser: they renamed the whole of Xerces from org.apache.xerces
to com.sun.org.apache.xerces.internal
. Is this technique drastic, time consuming, and hard to maintain? Yes. But it gets the job done, and I think it is an important possible alternative to keep in mind.
Another possibility is -- license terms abided -- copying/renaming single classes or even single methods out of a library.
HtmlUnit can do a lot, though. If you are really using a large portion of its functionality on a lot of varied input data, it would be hard to make a case for spending the large amount of time it would take to re-write the functionality from scratch, or repackage it.
As for the security concerns -- you might weigh the chances of a widely used library having problems, vs. the likelihood of your hand-written less-tested code having some security flaw. Ultimately you are responsible for the security of your programs, though -- so do what you feel you must.
精彩评论