Why doesn't Python have a hybrid getattr + __getitem__ built in?
I have methods that accept dicts or other objects and the names of "fields" to fetch from those objects. If the object is a dict, the method uses __getitem__ to retrieve the named key; otherwise it uses getattr to retrieve the named attribute. This is pretty common in web templating languages. For example, in a Chameleon template you might have:
<p tal:content="foo.keyname">Stuff goes here</p>
If you pass in foo as a dict like {'keyname': 'bar'}, then foo.keyname fetches the 'keyname' key to get 'bar'. If foo is an instance of a class like:
class Foo(object):
    keyname = 'baz'
then foo.keyname fetches the value from the keyname attribute. Chameleon itself implements that function (in the chameleon.py26 module) like this:
def lookup_attr(obj, key):
    try:
        return getattr(obj, key)
    except AttributeError as exc:
        try:
            get = obj.__getitem__
        except AttributeError:
            raise exc
        try:
            return get(key)
        except KeyError:
            raise exc
I've implemented it in my own package like:
try:
    value = obj[attribute]
except (KeyError, TypeError):
    value = getattr(obj, attribute)
The thing is, that's a pretty common pattern. I've seen that method or one awfully similar to it in a lot of modules. So why isn't something like it in the core of the language, or at least in one of the core modules? Failing that, is there a definitive way it should be written?
I sort of half-read your question, wrote the below, and then reread your question and realized I had answered a subtly different question. But I think the below still provides an answer of a sort. If you don't think so, pretend instead that you had asked this more general question, which I think includes yours as a sub-question:
"Why doesn't Python provide any built-in way to treat attributes and items as interchangeable?"
I've given a fair bit of thought to this question, and I think the answer is very simple. When you create a container type, it's very important to distinguish between attributes and items. Any reasonably well-developed container type will have a number of attributes -- often though not always methods -- that enable it to manage its contents in graceful ways. So for example, a dict has items, values, keys, iterkeys and so on. These attributes are all accessed using . notation. Items, on the other hand, are accessed using [] notation. So there can be no collisions.
What happens when you enable item access using . notation? Suddenly you have overlapping namespaces. How do you handle collisions now? If you subclass a dict and give it this functionality, either you can't use keys like items as a rule, or you have to create some kind of namespace hierarchy. The first option creates a rule that is onerous, hard to follow, and hard to enforce. The second option creates an annoying amount of complexity, without fully resolving the collision problem, since you still have to have an alternative interface to specify whether you want items the item or items the attribute.
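To make the collision concrete, here is a minimal sketch. AttrDict is a hypothetical class written only for this illustration, not something from the standard library; it falls back to key lookup when ordinary attribute lookup fails.

```python
class AttrDict(dict):
    """Hypothetical dict subclass that falls back to keys for attribute access."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # so real dict attributes such as .items always win.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


d = AttrDict(items=[1, 2, 3], color='red')
print(d.color)             # the key is reachable as an attribute
print(d['items'])          # the item stored under the key 'items'
print(callable(d.items))   # d.items is still the dict method, not the item
```

With this design the key 'items' is only reachable through [] notation, which is exactly the "alternative interface" problem described above.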
Now, for certain kinds of very primitive types, this is acceptable. That's probably why there's namedtuple in the standard library, for example. (But note that namedtuple is subject to these very problems, which is probably why it was implemented as a factory function (prevents inheritance) and uses weird, private method names like _asdict.)
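A short sketch of how namedtuple sidesteps the clash, assuming Python 3. Point and its field names are made up for the example; _fields and _asdict are real namedtuple helpers.

```python
from collections import namedtuple

# Hypothetical record type; the field names are chosen for illustration.
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

print(p.x, p[0])           # same value via attribute and via index
print(Point._fields)       # helper names start with an underscore ...
print(dict(p._asdict()))   # ... so they stay out of the way of field names

# A field name that starts with an underscore is rejected outright.
try:
    namedtuple('Bad', ['_asdict'])
except ValueError as err:
    print('rejected:', err)
```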
It's also very, very, very easy to create a subclass of object with no (public) attributes and use setattr on it. It's even pretty easy to override __getitem__, __setitem__, and __delitem__ to invoke __getattribute__, __setattr__ and __delattr__, so that item access just becomes syntactic sugar for getattr(), setattr(), etc. (Though that's a bit more questionable since it creates somewhat unexpected behavior.)
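A minimal sketch of that idea follows. Bag is a hypothetical name, and translating AttributeError into KeyError is my own choice so that item access fails the way callers of [] would expect.

```python
class Bag(object):
    """Hypothetical attribute holder whose item access delegates to attributes."""

    def __getitem__(self, name):
        try:
            return getattr(self, name)
        except AttributeError:
            raise KeyError(name)

    def __setitem__(self, name, value):
        setattr(self, name, value)

    def __delitem__(self, name):
        try:
            delattr(self, name)
        except AttributeError:
            raise KeyError(name)


b = Bag()
b.color = 'red'
b['size'] = 10
print(b['color'], b.size)   # both notations reach the same storage
del b['color']
print(hasattr(b, 'color'))  # the attribute is gone after del b['color']
```

Note that b['__class__'] also "works" here, which is one instance of the unexpected behavior mentioned above.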
But for any kind of well-developed container class that you want to be able to expand and inherit from, adding new, useful attributes, a __getattr__ + __getitem__ hybrid would be, frankly, an enormous PITA.
The closest thing in the Python standard library is namedtuple(): http://docs.python.org/dev/library/collections.html#collections.namedtuple
from collections import namedtuple
Foo = namedtuple('Foo', ['key', 'attribute'])
foo = Foo(5, attribute=13)
print(foo[1])   # 13, looked up by index
print(foo.key)  # 5, looked up by attribute name
Or you can easily define your own type that always stores its data in the dict itself but allows the appearance of attribute setting and getting:
class MyDict(dict):
    def __getattr__(self, attr):
        return self[attr]
    def __setattr__(self, attr, value):
        self[attr] = value

d = MyDict()
d.a = 3
d[3] = 'a'
print(d['a'])  # 3
print(d[3])    # 'a'
print(d['b'])  # raises KeyError
But don't do d.3, because that's a syntax error. There are of course more complicated ways of making a hybrid storage type like this; search the web for many examples.
As for how to check both, the Chameleon way looks thorough. As for "why isn't there a way to do both in the standard library", it's because ambiguity is BAD. Yes, we have duck typing and all other kinds of masquerading in Python, and classes are really just dictionaries anyway, but at some point we want different functionality from a container like a dict or list than we want from a class, with its method resolution order, overriding, etc.
You can pretty easily write your own dict subclass that natively behaves this way. A minimal implementation, which I like to call a "pile" of attributes, is like so:
class Pile(dict):
    # raise AttributeError for missing key here to fulfill API
    def __getattr__(self, key):
        if key in self:
            return self[key]
        else:
            raise AttributeError(key)
    def __setattr__(self, key, value):
        self[key] = value
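The comment about fulfilling the API matters because hasattr() and getattr() with a default only treat AttributeError as "missing". A small sketch, assuming Python 3; LeakyPile is a hypothetical counter-example added here for contrast.

```python
class Pile(dict):
    def __getattr__(self, key):
        if key in self:
            return self[key]
        raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value


class LeakyPile(dict):
    """Hypothetical variant that lets KeyError escape from __getattr__."""

    def __getattr__(self, key):
        return self[key]


p = Pile(a=1)
print(hasattr(p, 'b'))          # False
print(getattr(p, 'b', 'n/a'))   # n/a

try:
    hasattr(LeakyPile(a=1), 'b')
except KeyError:
    print('KeyError escaped from hasattr()')
```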
Unfortunately if you need to be able to deal with either dictionaries or attribute-laden objects passed to you, rather than having control of the object from the beginning, this won't help.
In your situation I would probably use something very much like what you have, except break it out into a function so I don't have to repeat it all the time.
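Something along these lines, as a rough sketch rather than a definitive version. The name get_field and the optional default argument are my own additions; the lookup order (item first, then attribute) follows the snippet in the question, which is the opposite of Chameleon's attribute-first order.

```python
_MISSING = object()  # sentinel so that None can be passed as a real default


def get_field(obj, name, default=_MISSING):
    """Hypothetical helper: try obj[name] first, then getattr(obj, name)."""
    try:
        return obj[name]
    except (KeyError, TypeError):
        # KeyError: a mapping without that key.
        # TypeError: obj is not subscriptable, or rejects this key type.
        pass
    if default is _MISSING:
        return getattr(obj, name)  # raises AttributeError when absent
    return getattr(obj, name, default)


class Foo(object):
    keyname = 'baz'


print(get_field({'keyname': 'bar'}, 'keyname'))  # bar
print(get_field(Foo(), 'keyname'))               # baz
print(get_field(Foo(), 'other', None))           # None
```

Because items are tried first, a dict key named 'items' is returned as the item rather than the method; swap the order if attributes should take precedence.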