matplotlib MemoryError on pcolorfast
The Problem:
I'm currently loading column data from text files into numpy arrays, then plotting them and saving the resulting images. Because the values always lie on an equally spaced grid, it seemed an appropriate time to use pcolorfast. Each array is necessarily square, usually between 1024x1024 and 8192x8192. At present, I'm only concerned with this working up to and including 4096x4096 sizes. This needs to be done for hundreds of files, and while the first image completes successfully, subsequent images crash on a MemoryError.
Unsuccessful solutions:
I've ensured, as per here, that I have hold = False in rc.
Limitations:
The images must be saved using all 4096x4096 values, and cannot be scaled down to 1024x1024 (as suggested here).
Notes:
After watching memory usage during each phase (create empty array, load values, plot, save), the array A is still sitting in memory after makeFrame is complete. Is an explicit call to delete it required? Does fig need to be explicitly deleted, or should pylab take care of that? The ideal situation (probably obvious) would be to have memory usage return to ~the same level as it was before the call to makeFrame().
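For reference, explicit cleanup along those lines might look roughly like this (a minimal sketch; the figure contents, the gc.collect() call, and the in-memory save target are illustrative assumptions, not part of the original code):

```python
import gc
import io
import numpy
import matplotlib
matplotlib.use("AGG")
import matplotlib.pylab as plt

def make_and_release(n=64):
    """Plot a small array, then release every reference explicitly."""
    A = numpy.zeros((n, n), float)
    fig = plt.figure(1)
    fig.gca().pcolorfast(A)
    fig.savefig(io.BytesIO(), format="png")  # render without touching disk
    plt.close(fig)   # drop pylab's internal reference to the figure
    del A, fig       # drop the local references
    gc.collect()     # reclaim any reference cycles immediately

make_and_release()
```

plt.close() matters here because pylab keeps its own reference to every open figure, so del alone does not make the figure collectable.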
Any and all advice is greatly appreciated. I've been trying to resolve this for a few days, so it's not unlikely I've missed something obvious. An obvious solution would be welcome (the alternative being that this is a more complicated problem).
Current code sample:
import numpy
import matplotlib
matplotlib.use("AGG")
import matplotlib.pylab as plt

def makeFrame(srcName, dstName, coloring, sideLength,
              dataRanges, delim, dpi):
    v, V, cmap = coloring
    n = sideLength
    xmin, xmax, ymin, ymax = dataRanges
    A = numpy.empty((n, n), float)
    dx = (xmax - xmin) / (n - 1)
    dy = (ymax - ymin) / (n - 1)
    srcfile = open(srcName, 'r')
    for line in srcfile:
        lineVals = line[:-1].split(delim)
        x = float(lineVals[0])
        y = float(lineVals[1])
        c = float(lineVals[2])
        # Find index from float value, adjust for rounding
        i = (x - xmin) / dx
        if (i - int(i)) > .05: i += 1
        j = (y - ymin) / dy
        if (j - int(j)) > .05: j += 1
        A[int(i), int(j)] = c  # array indices must be ints
    srcfile.close()
    print("loaded vals")
    fig = plt.figure(1)
    fig.clf()
    ax = fig.gca()
    ScalarMap = ax.pcolorfast(A, vmin=v, vmax=V, cmap=cmap)
    fig.colorbar(ScalarMap)
    ax.axis('image')
    fig.savefig(dstName, dpi=dpi)
    plt.close(1)
    print("saved image")
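As an aside on the loading loop above: if the text files are regular, the per-line parsing can be replaced by a single vectorized numpy.loadtxt call. A sketch, assuming the same (x, y, value) column order and grid spacing as the loop (the helper name and its signature are illustrative):

```python
import io
import numpy

def load_grid(src, n, xmin, xmax, ymin, ymax, delim=","):
    # Read all rows at once: shape (n*n, 3) with columns x, y, value.
    data = numpy.loadtxt(src, delimiter=delim)
    x, y, c = data[:, 0], data[:, 1], data[:, 2]
    dx = (xmax - xmin) / (n - 1)
    dy = (ymax - ymin) / (n - 1)
    # Round each coordinate to its nearest grid index in one shot.
    i = numpy.rint((x - xmin) / dx).astype(int)
    j = numpy.rint((y - ymin) / dy).astype(int)
    A = numpy.empty((n, n), float)
    A[i, j] = c  # fancy indexing scatters all values at once
    return A

# Tiny usage example on an in-memory 2x2 grid:
text = "0,0,1.0\n0,1,2.0\n1,0,3.0\n1,1,4.0"
A = load_grid(io.StringIO(text), 2, 0.0, 1.0, 0.0, 1.0)
```

This avoids millions of Python-level float() calls per file, though it does not by itself address the memory growth across figures.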
Caveats:
- There might be a better way to deal with this memory problem that I don't know about.
- I haven't been able to reproduce this error. When I use matplotlib.cbook.report_memory(), my memory usage seems to level out as expected.
Despite the caveats, I thought I'd mention a general, cheap method of dealing with problems caused by a program refusing to release memory: Use the multiprocessing module to spawn the problematic function in a separate process. Wait for the function to end, then call it again. Each time a subprocess ends, you regain the memory it used.
So I suggest trying something like this:
import matplotlib.cbook as mc
import multiprocessing as mp
import matplotlib.cm as cm

if __name__ == '__main__':
    for _ in range(10):
        srcName = 'test.data'
        dstName = 'test.png'
        vmin = 0
        vmax = 5
        cmap = cm.jet
        sideLength = 500
        dataRanges = (0.0, 1.0, 0.0, 1.0)
        delim = ','
        dpi = 72
        proc = mp.Process(target=makeFrame, args=(
            srcName, dstName, (vmin, vmax, cmap), sideLength,
            dataRanges, delim, dpi))
        proc.start()
        proc.join()
        usage = mc.report_memory()
        print(usage)