Why does Python list slice assignment eat memory?
I'm fighting a memory leak in a Python project and have already spent much time on it. I have reduced the problem to a small example. Now it seems like I know the solution, but I can't understand why it works.
import random

def main():
    d = {}
    used_keys = []
    n = 0
    while True:
        # choose a key unique enough among used previously
        key = random.randint(0, 2 ** 60)
        d[key] = 1234  # the value doesn't matter
        used_keys.append(key)
        n += 1
        if n % 1000 == 0:
            # clean up every 1000 iterations
            print 'thousand'
            for key in used_keys:
                del d[key]
                used_keys[:] = []
                #used_keys = []

if __name__ == '__main__':
    main()
The idea is that I store some values in the dict d and memorize the used keys in a list, so that I can clean the dict from time to time.
This variation of the program steadily eats memory and never gives it back. If I use the alternative way to „clear” used_keys that is commented out in the example, everything is fine: memory consumption stays at a constant level.
Why?
Tested with CPython on several Linux distributions.
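One way to see what the cleanup actually accomplishes, without reaching for a memory profiler, is to count how many dict entries survive each pass. A minimal instrumented sketch of the loop above (the 10,000-iteration bound is my addition so that it terminates):

import random

def main():
    d = {}
    used_keys = []
    n = 0
    while n < 10000:
        key = random.randint(0, 2 ** 60)
        d[key] = 1234
        used_keys.append(key)
        n += 1
        if n % 1000 == 0:
            for key in used_keys:
                del d[key]
                used_keys[:] = []  # same placement as in the question
            # with the slice assignment above this prints 999, 1998, 2997, ...
            # with `used_keys = []` instead it prints 0 every time
            print 'entries left in d:', len(d)

if __name__ == '__main__':
    main()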
Here's the reason: the current method does not delete the keys from the dict (it actually deletes only one of them). That is because you clear the used_keys list while the for loop is iterating over it, so the loop exits prematurely.
The second (commented-out) method, however, does work: there you rebind used_keys to a new empty list, the loop keeps iterating over the original list object, and it finishes successfully.
See the difference between:
>>> a = [1, 2, 3]
>>> for x in a:
...     print x
...     a = []
...
1
2
3
and
>>> a = [1, 2, 3]
>>> for x in a:
...     print x
...     a[:] = []
...
1
>>>
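Given that explanation, the minimal fix to the original program is simply to move the slice assignment out of the for loop, so the deletion loop runs to completion before the list is cleared. A sketch of the questioner's code with only that indentation change:

import random

def main():
    d = {}
    used_keys = []
    n = 0
    while True:
        key = random.randint(0, 2 ** 60)
        d[key] = 1234
        used_keys.append(key)
        n += 1
        if n % 1000 == 0:
            print 'thousand'
            for key in used_keys:
                del d[key]
            used_keys[:] = []  # now runs once, after every key has been deleted

if __name__ == '__main__':
    main()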
Why wouldn't something like this work?
from itertools import count
import uuid

def main():
    d = {}
    for n in count(1):
        # choose a key unique enough among used previously
        key = uuid.uuid1()
        d[key] = 1234  # the value doesn't matter
        if n % 1000 == 0:
            # clean up every 1000 iterations
            print 'thousand'
            d.clear()

if __name__ == '__main__':
    main()
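This sidesteps the problem entirely: d.clear() removes every entry in one call, so there is no separate key list to keep in sync and no loop that can be cut short. (uuid.uuid1() is just one convenient way to get unique keys here; the original random.randint keys would work just as well together with d.clear().)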