Is frozenset adequate for caching of symmetric input data in a python dict?
The title more or less says it all:
I have a function which takes symmetric input in two arguments, e.g. something like
def f(a1, a2):
return heavy_stuff(abs(a1 - a2))
Now, I want to introduce some caching method. Would it be correct / pythonic / reasonably efficient to do something like this:
cache = {}
def g(a1, a2):
fs =frozenset((tuple(a1), tuple(a2)))
if fs not in cache:
cache[fs] = f(a1, a2)
return cache[fs]
Or would there be some better way?
Edit开发者_Go百科: a1 and a2 might be the rows of a numpy array; that’s why I wrap them in a tuple each.
Python always computes all arguments you're passing to a function, and only then does it call the function. In other words, like most other languages, Python is "eager" in its evaluation (the major exception today is probably Haskell, but that doesn't help you;-).
So setdefault
is a very unsuitable approach for caching! Whenever you do
cache.setdefault(akey, f(x, y))
you're first calling f(x, y)
, with all its computational cost, then maybe dropping the results of that computation on the floor; this makes the caching totally ineffective.
Rather, always do it as follows:
akey = whatever(x, y)
if akey not in cache:
cache[akey] = f(x, y)
return cache[akey]
or the like -- there are a few other possible idioms, especially if e.g. you know that f
will never return None
:
result = cache.get(akey)
if result is None:
result = cache[akey] = f(x, y)
return result
As for the secondary issue of what's an appropriate whatever
for the key computation, given that you know that f
is symmetrical, I think frozenset
is probably OK; although (if the components of x
and y
are comparable, as well as hashable -- i.e. it wouldn't work with complex numbers) you might want to consider
ta1 = tuple(a1)
ta2 = tuple(a2)
if ta1 > ta2: key = ta1, ta2
else: key = ta2, ta1
the relative performance depends on the cost of comparing, vs that of hashing, the items in a1
and a2
. Differences are likely to be minor, anyway.
精彩评论