Load huge data into memory using Python
I have to load a large amount of data into memory, and the structure is a list. How can I take another approach? Thanks.
Process the data line by line, e.g.:

for line in open("file"):
    print "do processing of line:", line
Or if you really want to load one whole chunk into memory, you can try memory mapping.

import mmap, os, re

bigfile = open("file2", "rb")                   # binary mode, required for mmap on Windows
length = os.fstat(bigfile.fileno()).st_size     # os.fstat takes a file descriptor
try:
    # Unix: private, read-only mapping
    mapping = mmap.mmap(bigfile.fileno(), length, mmap.MAP_PRIVATE, mmap.PROT_READ)
except AttributeError:
    # Windows: MAP_PRIVATE/PROT_READ don't exist there, so use the access argument
    mapping = mmap.mmap(bigfile.fileno(), 0, None, mmap.ACCESS_READ)
data = mapping.read(length)

# example, finding a pattern
pat = re.compile("pattern", re.M | re.DOTALL)
print pat.findall(data)
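One caveat with the snippet above: mapping.read(length) copies the whole mapping back into an ordinary string, which gives away much of the benefit. The re module will scan the mmap object directly, so a variant along those lines (same made-up file name and pattern) would be:

import mmap, os, re

bigfile = open("file2", "rb")
length = os.fstat(bigfile.fileno()).st_size
mapping = mmap.mmap(bigfile.fileno(), length, access=mmap.ACCESS_READ)  # works on Unix and Windows

# Scan the mapping itself; pages are faulted in as the regex walks the file,
# so no second full-size copy of the data is created.
pat = re.compile("pattern", re.M | re.DOTALL)
for m in pat.finditer(mapping):
    print m.group(0)

mapping.close()
bigfile.close()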
If the data is too large to fit into your computer's memory, you will have to read it piece by piece. If it is not too large, you may still want to do this, but it might suit your needs just as well to read it into memory in its entirety. If you edit your question to explain more about your needs and the characteristics of the data, you will get much more helpful answers than this one.
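For example, a minimal piece-by-piece reading loop (the file name, chunk size, and the process() helper are placeholders for whatever your real processing is) looks like this:

def process(chunk):
    # placeholder: replace with whatever work you need to do per piece
    print "got %d bytes" % len(chunk)

f = open("file", "rb")
while True:
    chunk = f.read(64 * 1024)    # read 64 KB at a time
    if not chunk:                # empty string means end of file
        break
    process(chunk)
f.close()

This never keeps more than one chunk in memory at a time.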
Is there any structure to this data, like a big list of customer records, or is it just one big blob like image, audio, or video data? If the former, you might want to restructure the data into a database. sqlite is included with Python as of Py2.5, and is sufficient for many data sorting and sifting tasks.
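If it is record-shaped data like that, a rough sketch of the sqlite approach (the table layout, file names, and tab-separated format here are invented purely for illustration) could be:

import sqlite3

conn = sqlite3.connect("customers.db")    # the data lives on disk, not in a Python list
conn.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# load the big file one line at a time instead of building a huge in-memory list
for line in open("customers.txt"):
    name, city = line.rstrip("\n").split("\t")
    conn.execute("INSERT INTO customers (name, city) VALUES (?, ?)", (name, city))
conn.commit()

# the sorting and sifting now happen inside the database, not in Python's memory
for row in conn.execute("SELECT name FROM customers WHERE city = ? ORDER BY name", ("Oslo",)):
    print row[0]
conn.close()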
And how large is "large"? You would be surprised how much data Python can keep in memory at once. Give us some more details about your large list of data.