Is it Pythonic to check function argument types?
I know that type checking function arguments is generally frowned upon in Python, but I think I've come up with a situation where it makes sense to do so.
In my project I have an Abstract Base Class Coord, with a subclass Vector, which has more features like rotation, changing magnitude, etc. Lists and tuples of numbers will also return True for isinstance(x, Coord).
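The question does not show how Coord is defined, so the following is only a sketch of one way the described behaviour could be obtained. The metaclass name CoordMeta and the Vector constructor are assumptions made for illustration, not the asker's actual code.

```python
import numbers
from abc import ABCMeta


class CoordMeta(ABCMeta):
    """Hypothetical metaclass: lists/tuples of numbers count as Coord."""

    def __instancecheck__(cls, obj):
        # Customise the check only for Coord itself, so that subclasses
        # such as Vector keep the normal isinstance() behaviour.
        if cls is Coord and isinstance(obj, (list, tuple)):
            return all(isinstance(item, numbers.Number) for item in obj)
        return super().__instancecheck__(obj)


class Coord(metaclass=CoordMeta):
    pass


class Vector(Coord):
    def __init__(self, *components):  # assumed constructor
        self.components = components


print(isinstance([1, 2.5], Coord))       # list of numbers
print(isinstance(["a", "b"], Coord))     # list of strings
print(isinstance(Vector(1, 2), Coord))   # ordinary subclass instance
```

With this sketch a plain tuple is accepted wherever a Coord is expected, but it is still not a Vector.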
I also have many functions and methods that accept these Coord types as arguments. I've set up decorators to check the arguments of these methods. Here is a simplified version:
class accepts(object):
    def __init__(self, *types):
        self.types = types

    def __call__(self, func):
        def wrapper(*args):
            # Check each positional argument against the expected type.
            for i in range(len(args)):
                if not isinstance(args[i], self.types[i]):
                    raise TypeError
            return func(*args)
        return wrapper
This version is very simple and still has some bugs; it's just there to illustrate the point. It would be used like this:
import numbers

@accepts(numbers.Number, numbers.Number)
def add(x, y):
    return x + y
Note: I'm only checking argument types against Abstract Base Classes.
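For reference, a less simplified version might look something like the sketch below. It is a function-based variant of the same idea (not the class shown above) that also handles keyword arguments, keeps the wrapped function's name, and reports which argument failed. The message wording is an assumption.

```python
import functools
import inspect
import numbers


def accepts(*types):
    """Hypothetical variant: check arguments by position or by keyword."""
    def decorator(func):
        sig = inspect.signature(func)
        names = list(sig.parameters)

        @functools.wraps(func)  # keep func.__name__ and the docstring
        def wrapper(*args, **kwargs):
            # bind() maps every passed argument to its parameter name.
            bound = sig.bind(*args, **kwargs)
            for name, expected in zip(names, types):
                if name not in bound.arguments:
                    continue  # parameter left at its default value
                value = bound.arguments[name]
                if not isinstance(value, expected):
                    raise TypeError(
                        "%s() argument %r must be %s, not %s"
                        % (func.__name__, name, expected.__name__,
                           type(value).__name__)
                    )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@accepts(numbers.Number, numbers.Number)
def add(x, y):
    return x + y
```

Calling add(1, y=2.5) passes the check, while add("a", 2) raises a TypeError that names the argument x.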
Is this a good idea? Is there a better way to do it without having to repeat similar code in every method?
Edit:
What if I were to do the same thing, but instead of checking the types beforehand in the decorator, I catch the exceptions in the decorator:
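class accepts(object):
    def __init__(self, *types):
        self.types = types

    def __call__(self, func):
        def wrapper(*args):
            # Placeholder text; the message itself is up to you.
            message = "%s() expects arguments of types %r" % (func.__name__, self.types)
            try:
                return func(*args)
            except TypeError:
                raise TypeError(message)
            except AttributeError:
                raise AttributeError(message)
        return wrapper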
Is that any better?
Your taste may vary, but the Pythonic(tm) style is to just go ahead and use objects as you need to. If they don't support the operations you're attempting, an exception will be raised. This is known as duck typing.
There are a few reasons for favoring this style: first, it enables polymorphism by allowing you to use new kinds of objects with existing code so long as the new objects support the right operations. Second, it streamlines the successful path by avoiding numerous checks.
Of course, the error message you get when using wrong arguments will be clearer with type checking than with duck typing, but as I say, your taste may vary.
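As a small illustration of both points, here is a sketch of a function written in the duck-typing style. The names scale and Point are made up for this example.

```python
def scale(coord, factor):
    # No isinstance() check: anything that can be iterated over
    # and whose items support "*" is accepted.
    return [component * factor for component in coord]


class Point:
    """Hypothetical class that was never registered as any kind of Coord."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __iter__(self):
        return iter((self.x, self.y))


print(scale((1, 2), 3))       # a plain tuple works
print(scale(Point(1, 2), 3))  # so does a brand-new class

try:
    scale(5, 3)               # an int is not iterable
except TypeError as exc:
    print("TypeError:", exc)
```

The new Point class works without any changes to scale, and a wrong argument still fails with an exception, just a less specific one than an explicit check would give.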
One of the reasons Duck Typing is encouraged in Python is that someone might wrap one of your objects, and then it will look like the wrong type, but still work.
Here is an example of a class that wraps an object. A LoggedObject acts in all ways like the object it wraps, but when you call the LoggedObject, it logs the call before performing the call.
from somewhere import log
from myclass import A

class LoggedObject(object):
    def __init__(self, obj, name=None):
        if name is None:
            self.name = str(id(obj))
        else:
            self.name = name
        self.obj = obj

    def __call__(self, *args, **kwargs):
        log("%s: called with %d args" % (self.name, len(args)))
        return self.obj(*args, **kwargs)

a = LoggedObject(A(), name="a")
a(1, 2, 3)  # calls: log("a: called with 3 args")
If you explicitly test for isinstance(a, A), it will fail, because a is an instance of LoggedObject. If you just let the duck typing do its thing, this will work.
If someone passes the wrong kind of object by mistake, some exception like AttributeError will be raised. The exception might be clearer if you check for types explicitly, but I think overall this case is a win for duck typing.
There are times when you really need to test the type. The one I learned recently is: when you are writing code that works with sequences, sometimes you really need to know if you have a string, or it's any other kind of sequence. Consider this:
def llen(arg):
    try:
        return max(len(arg), max(llen(x) for x in arg))
    except TypeError:  # catch error when len() fails
        return 0  # not a sequence so length is 0
This is supposed to return the longest length of a sequence, or any sequence nested inside it. It works:
lst = [0, 1, [0, 1, 2], [0, 1, 2, 3, 4, 5, 6]]
llen(lst) # returns 7
But if you call llen("foo"), it will recurse until Python hits its recursion limit and raises an error.
The problem is that strings have the special property that they always act like a sequence, even when you take the smallest element from the string; a one-character string is still a sequence. So we cannot write llen() without an explicit test for a string.
def llen(arg):
    if isinstance(arg, str):  # Python 3.x; for 2.x use isinstance(arg, basestring)
        return len(arg)
    try:
        return max(len(arg), max(llen(x) for x in arg))
    except TypeError:  # catch error when len() fails
        return 0  # not a sequence so length is 0
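One further edge case: an empty sequence makes the inner max() raise ValueError, which the except clause does not catch. A possible variant for Python 3 is sketched below. Treating bytes like str and returning 0 for an empty sequence are assumptions on my part, not requirements from the original.

```python
def llen(arg):
    # Strings (and, by assumption, bytes) are treated as leaves:
    # their own length counts, but we never recurse into them.
    if isinstance(arg, (str, bytes)):
        return len(arg)
    try:
        length = len(arg)
    except TypeError:  # not a sized object, so it contributes 0
        return 0
    # default=0 keeps max() from raising ValueError on an empty sequence.
    return max(length, max((llen(x) for x in arg), default=0))


print(llen([0, 1, [0, 1, 2], [0, 1, 2, 3, 4, 5, 6]]))
print(llen("foo"))
print(llen([]))
```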
It is.
"Being Pythonic" is not a well-defined concept, but it is generally understood as writing code using appropriate language constructs, not being more verbose than necessary, following the Python style guide (PEP 8), and generally striving to have code that is pleasant to read. We also have the Zen of Python (import this) as guidance.
Does putting an @accepts(...) annotation on top of your function help or hurt readability? It probably helps, because Rule #2 says "Explicit is better than implicit". There is also PEP 484 (type hints), which was designed for much the same purpose.
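To show how the two ideas can meet, here is a rough sketch of a decorator that reads PEP 484-style annotations and checks them at call time. The name check_annotations is hypothetical, and the sketch only handles annotations that are plain classes.

```python
import functools
import inspect
import numbers
import typing


def check_annotations(func):
    """Hypothetical decorator: enforce plain-class annotations at call time."""
    hints = typing.get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Skip parameters without a hint, and hints that are not
            # plain classes (for example typing.Optional[int]).
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    "%s() argument %r must be %s, not %s"
                    % (func.__name__, name, expected.__name__,
                       type(value).__name__)
                )
        return func(*args, **kwargs)

    return wrapper


@check_annotations
def add(x: numbers.Number, y: numbers.Number) -> numbers.Number:
    return x + y
```

Here the expected types live in the signature itself, so there is no separate list of types to keep in sync with the parameters.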
Does checking types at run time count as Pythonic? It certainly takes a toll on execution speed, but the goal of Python was never to produce the most performant code possible, everything else be damned. Of course fast code is better than slow, but then readable code is better than spaghetti code, maintainable code is better than hackish code, and reliable code is better than buggy code. So, depending on the system you're writing, you may find that runtime type checks are worth the tradeoff.
In particular, Rule #10, "Errors should never pass silently.", may be viewed as supporting the extra type checks. As an example, consider the following simple case:
class Person:
    def __init__(self, firstname: str, lastname: str = ""):
        self.firstname = firstname
        self.lastname = lastname

    def __repr__(self) -> str:
        return self.firstname + " " + self.lastname
What happens when you call it like this: p = Person("John Smith".split())? Well, nothing at first. (This is already problematic: an invalid Person object was created, yet this error has passed silently.) Then some time later you try to view the person, and get
>>> print(p)
TypeError: can only concatenate list (not "str") to list
If you have JUST created the object, and if you're an experienced Python programmer, then you'll figure out what's wrong fairly quickly. But what if not? The error message is borderline useless (i.e. you need to know the internals of the Person class to make any use of it). And what if you did not view this particular object, but pickled it into a file, which was sent to another department and loaded a few months later? By the time the error is identified and corrected, your job may already be in trouble...
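For comparison, here is a minimal sketch of the same Person class with an explicit check in the constructor. The message wording is my own. The mistake now surfaces at the point where the object is created, and the message names the offending argument.

```python
class Person:
    def __init__(self, firstname: str, lastname: str = ""):
        # Fail early, while the caller's mistake is still on the stack.
        for name, value in (("firstname", firstname), ("lastname", lastname)):
            if not isinstance(value, str):
                raise TypeError(
                    "%s must be str, not %s" % (name, type(value).__name__)
                )
        self.firstname = firstname
        self.lastname = lastname

    def __repr__(self) -> str:
        return self.firstname + " " + self.lastname


try:
    Person("John Smith".split())  # split() returns a list
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # firstname must be str, not list
```

The intended call, Person(*"John Smith".split()), still works as before.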
That being said, you don't have to write the type-checking decorators yourself. There already exist modules specifically for this purpose, for example:
- typesentry
- typeguard
- enforce
If this is an exception to the rule, it's OK. But if the engineering/design of your project revolves around type checking every function (or most of them), then maybe you don't want to use Python. How about C# instead?
In my judgment, making a decorator for type checking generally means that you're going to be using it a lot. So in that case, while factoring common code into a decorator is Pythonic, the fact that it's for type checking is not very Pythonic.
There has been some talk about this because Py3k supports function annotations, of which type annotations are one application. There was also an effort to roll type checking into Python 2.
I think it never took off because the basic problem you're trying to solve ("find type bugs") is either trivial to begin with (you see a TypeError) or pretty hard (a slight difference in the type interfaces). Plus, to get it right, you need typeclasses and have to classify every type in Python. It's a lot of work for mostly nothing. Not to mention you'd be doing runtime checks all the time.
Python already has a strong and predictable type system. If we ever see something more powerful, I hope it comes through type annotations and clever IDEs.
In addition to the ideas already mentioned, you might want to "coerce" the input data into a type that has the operations you need. For instance, you might want to convert a tuple of coordinates into a NumPy array, so that you can perform linear algebra operations on it. The coercion code is quite general:
import numpy

input_data_coerced = numpy.array(input_data)  # Works for any input_data that is a sequence (tuple, list, NumPy array…)