Null pattern in Python underused?
Every now and then I come across code like this:
foo = Foo()
...
if foo.bar is not None and foo.bar.baz == 42:
shiny_happy(...)
开发者_JAVA百科
Which seems, well, unpythonic, to me.
In Objective-C, you can send messages to nil and get nil as the answer. I've always thought that's quite handy. Of course it is possible to implement a Null pattern in Python, however judging from the results Google's given me, it seems this is not very widely used. Why's that?
Or even better—would it be a bad idea to let None.whatever return None instead of raising an exception?
PEP 336 - Make None Callable proposed a similar feature:
None should be a callable object that when called with any arguments has no side effect and returns None.
The reason for why it was rejected was simply "It is considered a feature that None raises an error when called."
I'm sorry, but that code is pythonic. I think most would agree that "explicit is better than implicit" in Python. Python is a language that is easy to read compared to most, and people should not defeat that by writing cryptic code. Make the meaning very clear.
foo = Foo()
...
if foo.bar is not None and foo.bar.baz == 42:
shiny_happy(...)
In this code sample, it is clear that foo.bar is sometimes None on this code path, and that we only run shiny_happy() if it is not None, and .baz == 42. Very clear to anyone what is going on here and why. The same can not be said for the null pattern, or the try ... except code in one of the answers posted here. It's one thing if your language, like Objective-C or javascript enforces a null pattern, but in a language where it is not used at all, it will just create confusion and code that is difficult to read. When programming in python, do as the pythonistas do.
Couldn't you do a try except? The Pythonic way says It is Easier to Ask for Forgiveness than Permission.
So:
try:
if foo.bar.baz == 42:
shiny_happy(...)
except AttributeError:
pass #or whatever
Or do it without possibly silencing more exceptions than desired:
try:
baz = foo.bar.baz
except AttributeError:
pass # handle error as desired
else:
if baz == 42:
shiny_happy(...)
The handiness comes at the expense of dumb mistakes not being detected at the earliest possible time, as close to the buggy line of code as possible.
I think the handiness of this particular feature would be occasional at best, whereas dumb mistakes happen all the time.
Certainly a SQL-like NULL would be bad for testing, which really banks on propositions being either true or false. Consider this code, from unittest.py:
class TestCase:
...
def failUnless(self, expr, msg=None):
"""Fail the test unless the expression is true."""
if not expr: raise self.failureException, msg
Now suppose I have a test that does this:
conn = connect(addr)
self.failUnless(conn.isOpen())
Suppose connect
erroneously returns null. If I'm using the "null pattern", or the language has it built-in, and conn
is null, then conn.isOpen()
is null, and not conn.isOpen()
is null too, so the assertion passes, even though the connection clearly is not open.
I tend to think NULL
is one of SQL's worst features. And the fact that null
silently passes for an object of any type in other languages is not much better. (Tony Hoare called null references “my billion-dollar mistake”.) We need less of that sort of thing, not more.
As others have said, PEP 336 describes why this is the behavior.
Adding something like Groovy's "safe navigation operator" (?.
) could perhaps make things more elegant in some cases:
foo = Foo()
if foo.bar?.baz == 42:
...
I don't think it's a good idea. Here's why. suppose you have
foo.getValue()
Suppose getValue()
returns a number, or None
if not found.
Now suppose foo
was none by accident or a bug. As a result, it would return None
, and continue, even if there's a bug.
In other words, you are no longer able to distinguish if there's no value (returns None
) or if there's an error (foo
was None
to begin with). You are altering the return contract of a routine with a fact that is not under control of the routine itself, eventually overwriting its semantics.
Here's why I don't think that's a good idea:
foo = Foo() // now foo is None
// if foo is None, I want to raise an exception because that is an error condition.
// The normal case is that I expect a foo back whose bar property may or may not be filled in.
// If it's not filled in, I want to set it myself.
if not foo.bar // Evaluates to true because None.bar is None under your paradigm, I think
foo.bar = 42 // Now what?
How would you handle this case?
While I wholeheartedly agree with other answers here that say that it's a good thing that None
raises an exception when asked for a member, this is a little pattern I sometimes use:
getattr(foo.bar, 'baz', default_value)
Personally I believe throwing an exception is what should happen. Lets say for some reason you are writing software for missiles in Python. Imagine the atomic bomb and lets say there was a method called explode(timeToExplode) and timeToExplode got passed in as None. I think you would be unhappy at the end of the day when you lost the war, because you didn't find this in testing.
foo = Foo()
...
if foo.bar is not None and foo.bar.baz == 42:
shiny_happy(...)
The above can be cleaned up using the fact that None resolves to False:
if foo.bar and foo.bar.baz == 42:
shiny_happy(...)
else:
not_happy(...)
I am not a python programmer (just starting to learn the language) but this seems like the never-ending discussion on when to return error or throw exception, and the fact is that (of course this is only an opinion) it depends.
Exceptions should be used for exceptional conditions, and as such move the checks for rare situations outside of the main block of code. They should not be used when a not-found value is a common condition (think of requesting the left child of a tree, in many cases --all leaves-- it will be null). There is yet again a third situation with functions that return sets of values, where you might just want to return a valid empty set to ease the code.
I believe that following the above advice, code is much more legible. When the situation is rare you do not need to worry about null (an exception will be called) so less used code is moved out of the main block.
When null is a valid return value, processing it is within the main flow of control, but that is good, as it is a common situation and as such part of the main algorithm (do not follow a null edge in a graph).
In the third case, when requesting values from a function that can possibly return no values, returning an empty set simplifies the code: you can assume that the returned container exists and process all found elements without adding the extra checks in the code.
Then again, that is just an opinion, but being mine I tend to follow it :)
Disclaimer: this is probably not a very good idea, as it isn't very readable, but I'm just adding it here for completeness.
Exploit how and
works
Here is an expression that works if you want to apply an arbitrary function transform
to a nullable variable var: Optional[T]
, returning the result of that computation if the variable is not None
, or else None
(in the spirit of Java's Optional
):
var and transform(var)
This works for any function, not just in a boolean context (e.g. the condition to an if
clause). The reason is, the and
operator is defined as follows (from the docs):
The expression
x and y
first evaluatesx
; ifx
is false, its value is returned; otherwise,y
is evaluated and the resulting value is returned.
Thus, if var
is None
, var and transform(var)
will evaluate to None
(without evaluating transform(var)
, since boolean operators in Python are short-circuiting), and if it is not, it will evaluate to the result of transform(var)
and return that.
Not quite as concise as a proper null pattern, but still less verbose than
transform(var) if var is not None else None
Example
import datetime
def null_pattern_demo(x: datetime.datetime):
return x and x.timestamp()
>>> null_pattern_demo(None)
>>> null_pattern_demo(datetime.datetime(2020, 6, 29))
1593381600.0
Chaining
In Python 3.8+, with the help of assignment expressions, you can even chain this pattern, i.e. you can compose a sequence of functions and return None
if any of the computations returned None
, or return the final result otherwise:
var and (r := f(var)) and (r := g(r)) and (r := h(r))
This will evaluate to None
as soon as any of the intermediate steps (or var
itself) returns None
. As a side effect, the final result will also be stored in r
.
Example
def f(x, y):
return 2*x if x == y else None
def g(x, y):
return x + y if y != 42 else None
def h(x):
return x**2
If we try apply f
, g
and h
composed in one go, we will get an error if one of the intermediate steps returns None
:
>>> h(g(f(2, 2), 42))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in h
TypeError: unsupported operand type(s) for ** or pow(): 'NoneType' and 'int'
But, instead, we can use the above trick:
>>> (x := None) and (x := f(x, 3)) and (x := g(x, 1)) and (x := h(x))
>>> (x := 2) and (x := f(x, 3)) and (x := g(x, 1)) and (x := h(x))
>>> (x := 2) and (x := f(x, 2)) and (x := g(x, 42)) and (x := h(x))
>>> (x := 2) and (x := f(x, 2)) and (x := g(x, 1)) and (x := h(x))
25
精彩评论