How does the performance of dictionary key lookups compare in Python?
How does:
dict = {}
if key not in dict:
    dict[key] = foo
Compare to:
try:
    dict[key]
except KeyError:
    dict[key] = foo
That is, is the lookup of a key in any way faster than the linear search through dict.keys() that I assume the first form will do?
Just to clarify one point: if key not in d doesn't do a linear search through d's keys. It uses the dict's hash table to quickly find the key.
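A small sketch can make the hash-table point visible. CountingKey below is a hypothetical helper class (not from the standard library) that counts how often Python hashes or compares it. The behaviour described in the comments is what CPython does; the exact counts may differ on other implementations.

```python
class CountingKey:
    """Hypothetical key type that counts hash and equality calls."""
    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.value == other.value


d = {CountingKey(i): i for i in range(1000)}
probe = CountingKey(500)

# Membership test on the dict: one hash, then only the candidates
# in the matching slot are compared.
CountingKey.hash_calls = CountingKey.eq_calls = 0
found_in_dict = probe in d
dict_hash_calls = CountingKey.hash_calls
dict_eq_calls = CountingKey.eq_calls

# Membership test on a list of the same keys: a real linear search,
# comparing element by element until a match turns up.
keys = list(d)
CountingKey.hash_calls = CountingKey.eq_calls = 0
found_in_list = probe in keys
list_eq_calls = CountingKey.eq_calls

print("dict:", dict_hash_calls, "hash call(s),", dict_eq_calls, "comparison(s)")
print("list:", list_eq_calls, "comparison(s)")
```

The dict test needs far fewer comparisons than the list test, which is what makes key not in d cheap even for large dictionaries.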
You're looking for the setdefault method:
>>> r = {}
>>> r.setdefault('a', 'b')
'b'
>>> r
{'a': 'b'}
>>> r.setdefault('a', 'e')
'b'
>>> r
{'a': 'b'}
The answer depends on how often the key is already in the dict (BTW, has anyone mentioned to you how bad an idea it is to hide a builtin such as dict behind a variable?)
if key not in dct:
    dct[key] = foo
If the key is in the dictionary this does one dictionary lookup. If the key is not in the dictionary it looks up the dictionary twice: once for the membership test and once for the assignment.
try:
    dct[key]
except KeyError:
    dct[key] = foo
This may be slightly faster for the case where the key is in the dictionary, but throwing an exception has quite a big overhead, so it is almost always not the best option.
dct.setdefault(key, foo)
This one is slightly tricky: it always involves two dictionary lookups: the first one is to find the setdefault method in the dict class, the second is to look for key in the dct object. Also if foo is an expression it will be evaluated every time whereas the earlier options only evaluate it when they have to.
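The point about foo being evaluated every time can be checked with a short sketch. make_default is a hypothetical stand-in for an expensive expression, and the calls counter exists only for illustration.

```python
calls = {"count": 0}


def make_default():
    """Hypothetical stand-in for an expensive expression such as foo."""
    calls["count"] += 1
    return []


dct = {"present": [1, 2, 3]}

# setdefault: the argument is evaluated before the call happens, even
# though the key already exists and the new list is thrown away.
dct.setdefault("present", make_default())
calls_after_setdefault = calls["count"]

# Membership test: the expression only runs when the key is missing,
# so the counter does not move here.
if "present" not in dct:
    dct["present"] = make_default()
calls_after_in_check = calls["count"]

print(calls_after_setdefault, calls_after_in_check)
```

When the default is a cheap literal this makes no practical difference; it matters when building the default is costly or has side effects.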
Also look at collections.defaultdict. That is the most appropriate solution for a large class of situations like this.
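As a minimal sketch of the defaultdict approach, here is a grouping example. The word list and the variable names are made up for illustration.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# A missing key is created on first access by calling list(), so no
# membership test, try/except, or setdefault call is needed.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# A plain membership test does not create the key.
has_z = "z" in by_letter

print(dict(by_letter))
```

Note that indexing a missing key (by_letter["z"]) would insert an empty list, so use in or get when you only want to look.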
Try: my_dict.setdefault(key, default). It's slightly slower than the other options, though.
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
#!/usr/bin/env python
example_dict = dict(zip(range(10), range(10)))

def kn(key, d):
    # membership test, then assign
    if key not in d:
        d[key] = 'foo'

def te(key, d):
    # try the lookup, assign on KeyError
    try:
        d[key]
    except KeyError:
        d[key] = 'foo'

def sd(key, d):
    # let setdefault handle both cases
    d.setdefault(key, 'foo')

if __name__ == '__main__':
    from timeit import Timer
    # Key 2 is already in example_dict, so this times the "key present" case only.
    t = Timer("kn(2, example_dict)", "from __main__ import kn, example_dict")
    print(t.timeit())
    t = Timer("te(2, example_dict)", "from __main__ import te, example_dict")
    print(t.timeit())
    t = Timer("sd(2, example_dict)", "from __main__ import sd, example_dict")
    print(t.timeit())

# kn: 0.249855041504
# te: 0.244259119034
# sd: 0.375113964081
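The timings above only cover a key that is already present. A rough sketch for comparing the missing-key case as well could look like the following. The bench helper and the choice of 99 as a missing key are assumptions for illustration, and no timings are quoted because they vary by machine and Python version.

```python
from timeit import timeit


def kn(key, d):
    if key not in d:
        d[key] = 'foo'


def te(key, d):
    try:
        d[key]
    except KeyError:
        d[key] = 'foo'


def sd(key, d):
    d.setdefault(key, 'foo')


def bench(func, key, number=100_000):
    """Time func(key, d) with a fresh copy of the dict on every call."""
    base = dict(zip(range(10), range(10)))
    # Copying keeps a missing key missing on every iteration. The copy
    # costs the same for all three functions, so it only adds a
    # constant offset to each measurement.
    return timeit(lambda: func(key, dict(base)), number=number)


if __name__ == '__main__':
    for label, key in (("hit", 2), ("miss", 99)):
        for func in (kn, te, sd):
            print(label, func.__name__, bench(func, key))
```

Comparing the hit and miss rows for te should show how much the raised KeyError costs relative to the other two approaches.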
my_dict.get(key, foo) returns foo if key isn't in my_dict. The default value is None, so my_dict.get(key) will return None if key isn't in my_dict. The first of your options is better if you want to just add key to your dictionary. Don't worry about speed here. If you find that populating your dictionary is a hot spot in your program, then think about it. But it isn't. So don't.
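To make the difference between get and adding a key concrete, here is a short sketch. The dictionary contents are made up for illustration.

```python
my_dict = {"a": 1}

# get returns the fallback but leaves the dictionary untouched.
value = my_dict.get("b", 0)
after_get = dict(my_dict)

# With no fallback given, get returns None for a missing key.
missing = my_dict.get("b")

# setdefault returns the fallback and also stores it under the key.
stored = my_dict.setdefault("b", 0)
after_setdefault = dict(my_dict)

print(after_get, after_setdefault)
```

So get suits read-only lookups with a fallback, while the if key not in form or setdefault is what you want when the key should end up in the dictionary.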