Java: check for null or allow exception handling
I'm wondering about the cost of using a try/catch to handle nulls compared to using an if statement to check for nulls first.
To provide more information. There's a > 50% chance of getting nulls, because in this app. it is common to have a null if no data has been entered... so to attempt a calculation using a null is commonplace.
This being said, would it improve performance if I use an if statement to check for null before the calculation and simply not attempt the calculation in the first place, or is it less expensive to just let the exception be thrown and handle it?
thanks for any suggestions :-)
Thanks for the great, thought-provoking feedback! Here's a PSEUDOcode example to clarify the original question:
BigDecimal value1 = null; // assume value1 came from the DB as null
BigDecimal divisor = new BigDecimal("2.0");
try {
    if (value1 != null) { // does this enhance performance?... >50% chance that value1 WILL be null
        value1.divide(divisor);
    }
}
catch (Exception e) {
    // process, log etc. the exception
    // do this EVERY TIME I get null... or use an if statement
    // to capture other exceptions.
}
I'd recommend checking for null and not doing the calculation rather than throwing an exception.
An exception should be "exceptional" and rare, not a way to manage flow of control.
I'd also suggest that you establish a contract with your clients regarding input parameters. If nulls are allowed, spell it out; if they're not, make it clear what should be passed, what the default values are, and what you promise to return if a null value is passed.
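The recommendation above, sketched for the BigDecimal example from the question. This is only an illustration: the class name NullCheckSketch, the method name halveOrNull and the "null in, null out" contract are assumptions made for the sketch, not something the question specifies.

```java
import java.math.BigDecimal;

class NullCheckSketch {
    private static final BigDecimal DIVISOR = new BigDecimal("2.0");

    /**
     * Halves the value.
     * Contract (assumed for this sketch): a null input means "no data entered"
     * and yields a null result; no exception is thrown for it.
     */
    static BigDecimal halveOrNull(BigDecimal value) {
        if (value == null) {
            return null; // expected, common case: skip the calculation
        }
        return value.divide(DIVISOR);
    }

    public static void main(String[] args) {
        System.out.println(halveOrNull(null));
        System.out.println(halveOrNull(new BigDecimal("5")));
    }
}
```

The contract lives in the Javadoc, so a caller knows a null result means "nothing to calculate" rather than a failure.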
If passing a null argument is an exceptional case, then I'd throw a NullPointerException.
public Result calculate(Input input) {
    if (input == null) throw new NullPointerException("input");
    // ...
}
If passing null is an allowed case, then I'd skip the calculation and eventually return null. But in my opinion that makes less sense. Passing null in the first instance would seem to be a bug in the calling code.
Whatever way you choose, it should be properly documented in the Javadoc to avoid surprises for the API user.
Try and catch are close to "free" but throws can be very expensive. Typically VM creators do not optimize exception paths since, well, they are exceptional (supposed to be rare).
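One commonly cited reason a throw costs more than a branch is that constructing an exception object captures the current stack trace. A minimal sketch of just that fact, with a hypothetical class name; it measures nothing, so profile your own code if the actual cost matters:

```java
class ThrowCostSketch {
    /** Throws and catches an exception, returning how many stack frames it recorded. */
    static int capturedFrames() {
        try {
            throw new IllegalStateException("demo");
        } catch (IllegalStateException e) {
            // The trace was filled in when the exception was constructed.
            return e.getStackTrace().length;
        }
    }

    public static void main(String[] args) {
        System.out.println("frames captured: " + capturedFrames());
    }
}
```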
NullPointerException indicates a programmer mistake (it is a RuntimeException) and should not be caught. Instead of catching the NullPointerException, you should fix your code so that the exception is not thrown in the first place.
Worse yet, if you catch the NullPointerException and a different part of the calculation code throws a NullPointerException than the one you expect to throw it, you have now masked a bug.
To fully answer your question, I would implement it one way, profile it, and then implement it the other way and profile it... then I would only use the one with the if statement that avoids throwing the NullPointerException regardless of which is faster simply because of the point above.
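The masking problem described in this answer can be sketched as follows. Everything here is hypothetical: buggyDivisor stands in for some unrelated piece of code that wrongly returns null, and the "no data" string is just a placeholder result.

```java
import java.math.BigDecimal;

class MaskedBugSketch {
    /** Hypothetical helper with a bug: it should never return null, but does. */
    static BigDecimal buggyDivisor() {
        return null;
    }

    /** Catches NullPointerException, intending to handle only a null value. */
    static String halveCatching(BigDecimal value) {
        try {
            return value.divide(buggyDivisor()).toString();
        } catch (NullPointerException e) {
            return "no data"; // also swallows the NPE caused by the null divisor
        }
    }

    /** Checks the expected null explicitly, so any other NPE still surfaces. */
    static String halveChecking(BigDecimal value) {
        if (value == null) {
            return "no data";
        }
        return value.divide(buggyDivisor()).toString(); // the divisor bug throws here
    }
}
```

With a non-null value, halveCatching quietly reports "no data" while halveChecking fails loudly at the line that has the bug.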
If there's a >50% chance of getting a null, then it's hardly an exception?
Personally, I'd say that if you expect something to happen, you should code for it appropriately - in this case, checking for null and doing whatever is appropriate. I've always understood throwing an exception to not be exceedingly cheap, but couldn't say for certain.
I agree with most of the other responses that you should prefer the null check to the try-catch. And I've upvoted some of them.
But you should try to avoid the need as much as possible.
"There's a > 50% chance of getting nulls, because in this app. it is common to have a null if no data has been entered... so to attempt a calculation using a null is commonplace."
That's what you should really be fixing.
Come up with sensible default values that ensure the computation works or avoid calling a computation without supplying the needed input data.
For many of the standard data types and computations involving them there are sensible default values. Default numbers to 0 or 1 depending on their meaning, default strings and collections to empty, and many computations just work. For more complex objects of your own making, consider the Null Object pattern.
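A minimal sketch of both ideas, defaults and the Null Object pattern. The names DefaultsSketch, orZero, orEmpty, Discount and NO_DISCOUNT are invented for illustration, and whether zero or an empty list is a sensible default depends entirely on what the value means in your application.

```java
import java.math.BigDecimal;
import java.util.Collections;
import java.util.List;

class DefaultsSketch {
    /** Replaces a missing number with zero (assumes zero is a meaningful default). */
    static BigDecimal orZero(BigDecimal value) {
        return value != null ? value : BigDecimal.ZERO;
    }

    /** Replaces a missing list with an immutable empty list. */
    static List<String> orEmpty(List<String> list) {
        return list != null ? list : Collections.<String>emptyList();
    }

    /** A small strategy interface used to show the Null Object pattern. */
    interface Discount {
        BigDecimal apply(BigDecimal price);
    }

    /** Null Object: a Discount that does nothing, used instead of a null reference. */
    static final Discount NO_DISCOUNT = price -> price;
}
```

Callers can then invoke NO_DISCOUNT.apply(price) unconditionally instead of testing a Discount reference for null.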
If you have any case where a result or input can't be handled by your program, then that should be an error. You should know what your program can handle and allow only that. As for possible future cases where a result could be handled but isn't yet, I would still suggest treating it as an error until you actually need that result. If in doubt (you don't know, the program doesn't know, and you haven't equipped your program to handle it), it's an error. The program can't do anything more than stop.
For user input it is a very bad idea to rely on your program to eventually crash. You don't know when or even if it will crash. It might just end up doing the wrong thing, or do ten things it shouldn't have done (and wouldn't have done if the input had been validated) and then crash.
In terms of guarding against your own mistakes, that is more of a mixed bag. You'll want to focus a lot more on making sure things work by testing, eliminating unknowns, proofreading, making sure you know exactly how your program works, etc. Still, you'll occasionally have cases where internal processing might produce undesirable results.
When it turns out a result isn't an error for a given case, you do not handle an exception; you handle null. In that case the result is not an error, so you handle the result and not the error. It could not be simpler. If you're catching an exception and then doing something that could be done with an if, such as:
try
    extra = i_need_it(extra_id)
    show('extra', extra)
catch
    show('add_extra')
Then that is not right. You have a perfectly acceptable course of action if you don't have the extra thing.
This is much better and it keeps your intention clear without the extra verbosity:
Something extra = i_want_it(extra_id)
if extra === null
    show('add_extra')
else
    show('extra', extra)
Notice here you need nothing special to avoid catching an exception from another layer. The way I wrote the try/catch above is bad practice. You should only be wrapping the thing that throws an exception:
Something extra
try
    extra = i_need_it(extra_id)
catch
    extra = null
if extra === null
    show('add_extra')
else
    show('extra', extra)
When you think about it like that, it is just converting null to an exception and then back again. This is Yo-Yo coding.
You should start with:
Object i_need_it(int id) throws
Until you are actually able to implement handling for null. If you're able to implement handling for the exception you can implement the handling for the null.
When it turns out that something isn't always needed either add this method or change i_need_it to it (if null is always handled):
Object|null i_want_it(int id)
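In Java, the two signatures might look roughly like this. It is a sketch under assumptions: ExtraStore, wantIt and needIt are hypothetical names, an in-memory map stands in for whatever the real source is, and NoSuchElementException is just one reasonable choice of exception.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class ExtraStore {
    private final Map<Integer, String> extras = new HashMap<>();

    void put(int id, String extra) {
        extras.put(id, extra);
    }

    /** Returns the extra, or null when there is none; the caller decides what absence means. */
    String wantIt(int id) {
        return extras.get(id);
    }

    /** Returns the extra, or throws when there is none; for callers that cannot continue without it. */
    String needIt(int id) {
        String extra = wantIt(id);
        if (extra == null) {
            throw new NoSuchElementException("no extra with id " + id);
        }
        return extra;
    }
}
```

A caller that can show an "add extra" prompt uses wantIt and an if; a caller that has no sensible fallback uses needIt and lets the exception reach the top.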
An alternative is to check if it exists first:
bool does_it_exist(int id)
The reason this isn't done so often is because it usually comes out like this:
if(does_it_exist(id))
    Something i_need = i_need_it(id)
This tends to be more prone to concurrency problems, can require more calls that might be unreliable (IO over a network), and can be inefficient (two RTTs rather than one). Other operations are often merged in the same way, such as update if exists, insert if unique, or update if exists else insert, which then return what you would otherwise have found out by checking first. Some of these involve trade-offs between payload size and RTT efficiency, which can also vary based on a number of factors.
It is cheaper however when you need alternating behaviour based on if something exists or not but you don't need to work on it. If you also don't need to worry about the above concerns it's a bit clearer.
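The concurrency concern with checking first and fetching second can be sketched with a shared map. The class and method names are hypothetical, and a ConcurrentHashMap stands in for any store that other threads or processes can modify.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CheckThenActSketch {
    /** Two lookups: the entry can disappear between containsKey and get. */
    static String twoStep(Map<Integer, String> store, int id) {
        if (store.containsKey(id)) {
            return store.get(id); // may still be null if another thread removed it
        }
        return null;
    }

    /** One lookup: existence check and retrieval happen in a single call. */
    static String oneStep(Map<Integer, String> store, int id) {
        return store.get(id);
    }

    public static void main(String[] args) {
        Map<Integer, String> store = new ConcurrentHashMap<>();
        store.put(1, "extra");
        System.out.println(twoStep(store, 1));
        System.out.println(oneStep(store, 2));
    }
}
```

In a single thread both methods return the same thing; the difference only shows up when something else changes the store between the two calls in twoStep.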
You may even want:
void it_must_exist(int id) throws
This is again useful because if you only need to ensure that something exists, that's often cheaper than getting it. However, it's rare that you'll need this, as in most cases you'll want to know whether something exists so that you can work on it directly.
A way to conceive it is that you wouldn't make 'does_it_exist' throw an exception when it can simply return a boolean explicitly up the call stack, 'i_want_it' is a combined 'has' and 'get' in effect.
While having two separate methods separates the method signatures more clearly, sometimes you may need to pass the requirement down from something else, and the simplest way to do that, if you don't mind a bit of ambiguity, is:
Object|null get(int id, bool require) throws
This is better as you're handing the contract down the call chain rather than building a house on sand based on action at a distance. There are ways to pass down your contract more explicitly, but they tend to be convoluted YAGNI (i.e., passing down a method caller).
You should throw exceptions early, and you want to be safe rather than sorry, so false positives are fine. Once you discover that something is a false positive, though, you fix it at the source.
It should be extremely rare that you're handling exceptions at all. The vast majority should hit the top and then invoke a logging and output handler. The rest you fix appropriately by passing back the result directly and handling it. When you have one false positive out of many uses, fix only that use. Don't just remove the check at the root and break the many other cases where it's still an error.
Java is an unfortunate language because you have no way of saying "don't pass null" or "this variable must be non-null".
When such a feature is lacking, it's often best to check for nulls at their sources, such as IO, rather than every time something is passed to one of your methods. Otherwise that's an absurd amount of null checking.
You can apply this pattern to create functions to replace your ifs for parameter checking if you really need that. You would replace id with the object itself such as:
Object i_want(Object it) throws
    if(it === null)
        throw
    return it;
Then:
void give_me(Object it)
    this.it = Lib<Object>::i_want(it)
A shame there's no passthru type.
Or:
void i_get_it(Getter<Object> g)
    this.it = Lib<Object>::i_want(g.gimme())
You might have a specific Getter rather than a generic one.
Or:
void i_need_it(Result<Object> r)
    this.it = r.require()
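For what it's worth, the Java standard library already ships a pass-through check of this shape: java.util.Objects.requireNonNull returns its argument or throws a NullPointerException with the given message. A small sketch, where the Holder class and its accessor are hypothetical:

```java
import java.util.Objects;

class Holder {
    private final Object it;

    Holder(Object it) {
        // Fails immediately at the point where the null was handed over,
        // rather than later when something tries to use the field.
        this.it = Objects.requireNonNull(it, "it must not be null");
    }

    Object it() {
        return it;
    }
}
```

The check is generic, so the returned value keeps its static type and can be assigned in the same expression.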
If you instead check only at the boundary (where you call things outside your control that can't guarantee a non-null result), or at any use of a method documented as returning null, then you are checking only where it is really needed, and that is the preferred approach. It does mean, though, that when you do get a null where it doesn't belong (mistakes happen), you're not going to have an easy time finding out where it came from. It can get passed around a lot from IO and not produce a null pointer exception until something tries to operate on it. Such nulls can be passed around the system for days or even months, a ticking time bomb waiting to go off.
I wouldn't do it myself, but I wouldn't blame people in some cases for implementing the defensive programming approach above, which might be required due to Java's broken type system, which is loose by default and can't be restricted. In robustly typed languages, null isn't permitted unless you explicitly say it is.
Please be advised that although I call it broken, I have been typically using significantly looser languages heavily for decades to build large, complex and critical systems without having to litter the codebase with superfluous checks. Discipline and competence are what determine quality.
A false positive is when a result or a condition occurs that you assume is a result that can't be handled by all callers but it turns out that at least one caller can handle it appropriately. In that case you don't handle the exception but instead give the caller the result. There are very few exceptions to this.
Java 8 has Optional, but it doesn't really look helpful. It's a horrific case of the inner-platform effect: trying to implement new syntax as classes and ending up having to add half of the existing Java syntax along with it, which makes it very clunky. As usual, modern Java OOP solves every problem of people wanting less fudge by adding more fudge and overcomplicating things. If you really want chaining like that, you might want to try something such as Kotlin, which implements that syntax directly.
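For readers who haven't seen it, this is roughly what the Optional style looks like when applied to the halving example from the question, so you can judge the clunkiness for yourself. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.Optional;

class OptionalSketch {
    private static final BigDecimal DIVISOR = new BigDecimal("2");

    /** Wraps a possibly-null value and halves it only when it is present. */
    static Optional<BigDecimal> halve(BigDecimal valueOrNull) {
        return Optional.ofNullable(valueOrNull).map(v -> v.divide(DIVISOR));
    }

    public static void main(String[] args) {
        System.out.println(halve(null).isPresent());
        System.out.println(halve(new BigDecimal("4")).isPresent());
    }
}
```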
A modern language will mean you don't have to worry about most of this and just have:
void i_need_it(Object it)
    this.it = it
void i_want_it(Object? it)
    this.it = it
A modern language might even return the original object for a method without a return (replace void with self as the standard and auto-return) so people can have their fancy chaining or whatever else is fashionable these days in programming.
You can't have a factory with a base class that gives you a Null or NotNull either because you'll still have to pass the base class and that'll be a type violation when you say you want NotNull.
You might want to play around with aspects, static analysis, etc., although that's a bit of a rabbit hole. All documentation should indicate whether null can be returned (although if it is only returned indirectly, this can potentially be left out).
You can make a class such as MightHave to wrap your result, with methods like get, require and has, if you don't like statics but want to eat an if. It's also in the realm of mildly convoluted, though: it messes with all of your method signatures and boxes everything all over the place, which is an ugly solution. It's only really handy as an alternative in those rare exception-handling cases where exceptions do seem useful due to the complexity of the call graph and the number of unknowns present across layers.
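A minimal sketch of what such a wrapper could look like. Only the names MightHave, get, require and has come from the text; the factory method, the choice of NoSuchElementException and the rest of the design are assumptions.

```java
import java.util.NoSuchElementException;

final class MightHave<T> {
    private final T value; // null means "absent"

    private MightHave(T value) {
        this.value = value;
    }

    /** Wraps a result that may or may not be null. */
    static <T> MightHave<T> of(T valueOrNull) {
        return new MightHave<>(valueOrNull);
    }

    /** True when a value is present. */
    boolean has() {
        return value != null;
    }

    /** Returns the value, or null when absent; the caller handles the null. */
    T get() {
        return value;
    }

    /** Returns the value, or throws when absent; for callers that cannot proceed without it. */
    T require() {
        if (value == null) {
            throw new NoSuchElementException("required value is absent");
        }
        return value;
    }
}
```

The source of the value returns MightHave.of(result) and each caller picks get or require, which is how the contract gets decided at the point of use rather than at the source.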
One exceptionally rare case is when your source knows what exception to throw but not if it needs to be thrown but it's hard to pass down (although coupling two layers at a distance where anything can happen in between needs to be approached with caution). Some people might also want this because it can easily give a trace of both where the missing item came from and where it was required which is something using checks might not give (they're likely to fail close to the source but not guaranteed). Caution should be taken as those kinds of problems might be more indicative of poor flow control or excessive complexity than an issue with justified polymorphism, meta/dynamic coding and the like.
Caution should be taken with things such as defaults or the Null Object pattern. Both can end up hiding errors and becoming best guesses or a cure worse than the disease. Things such as the NullObject pattern and Optional can often be used to simply turn off, or rather bypass, the inbuilt error handling in your interpreter.
Defaults aren't always bad but can fall into the realm of guessing. Hiding errors from the user ends up setting them up to fail. Defaults (and sanitisation) always need to be thought out carefully.
Things such as the NullObject pattern and Optional can be overused to effectively turn off null pointer checking. That simply makes the assumption that null is never an error, which ends up with programs doing some things and not others, but you know not what. In some cases this might have hilarious results, such as when the user's credit card is null. If you're using such things, make sure you're not using them to the effect of simply wrapping all of your code in a try/catch that ignores a null pointer exception. This is very common because people tend to fix errors where the exception was thrown. In reality the real error tends to be further away. You end up with one piece of faulty IO handling that erroneously returns null, and that null gets passed all around the program. Instead of fixing that one source of null, people will instead try to fix it in all the places it reaches where it causes an error.
The NullObject pattern and MagicNoop patterns have their place but are not for common use. You shouldn't use them until it becomes immediately apparent that they are useful in a justified manner that isn't going to cause more problems than it solves. Sometimes a noop is effectively an error.