Advantages of mmap vs fileinput
I read that mmap has an advantage over fileinput: it reads a page into the kernel page cache and shares that page with the user address space, whereas fileinput brings a page into the kernel and then copies each line into the user address space. So there is extra space overhead with fileinput.
So I am planning to move to mmap, but I would like to know from advanced Python hackers whether it actually improves performance.
If so, is there an implementation similar to fileinput that uses mmap?
Please point me to any open-source code if you are aware of some.
Thank you.
mmap maps a file into your process's address space so that you can index it like an array of bytes or treat it as one big data structure. The operating system loads the pages into RAM as you touch them.
It's a lot faster if you are accessing your file in a "random-access" manner, that is, doing a lot of fseek(), fread(), fwrite() combinations.
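To make the random-access case concrete, here is a minimal sketch using Python's standard mmap module. The helper name read_records and the demo file are made up for illustration. It assumes a non-empty file, since mmap cannot map an empty one.

```python
import mmap
import os
import tempfile


def read_records(path, offsets, size):
    """Return `size` bytes at each offset by slicing a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 means "map the whole file"; this fails on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing replaces the usual seek() + read() pair.
            return [mm[off:off + size] for off in offsets]


# Hypothetical demo data: a 1000-byte file of repeating digits.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789" * 100)
    demo_path = tmp.name

chunks = read_records(demo_path, [990, 0, 505], 5)
os.remove(demo_path)
```

Each slice copies only the bytes you ask for, so jumping around the file costs no extra system calls once the pages are resident.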
But if you are just reading the file in and processing each line once (say), then it is unlikely to be significantly faster. In fact, for any reasonable file size it is probably indistinguishable. (Remember that with mmap the pages you are working on should fit in RAM; otherwise paging occurs, which begins to reduce the efficiency of mmap.)
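I am not aware of a drop-in mmap version of fileinput, but if you want to measure the sequential case yourself, a rough stand-in is easy to write. The sketch below is an assumption of what such a helper could look like, not an existing library: mmap_lines is a made-up name, it yields bytes rather than str, and it leaves out fileinput extras such as stdin handling and line-number tracking.

```python
import mmap
import os
import tempfile


def mmap_lines(paths):
    """Yield each line (as bytes) from each file in turn, reading via mmap."""
    for path in paths:
        with open(path, "rb") as f:
            if os.fstat(f.fileno()).st_size == 0:
                continue  # mmap cannot map an empty file, so skip it
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # mm.readline() returns b"" once the end of the map is reached.
                for line in iter(mm.readline, b""):
                    yield line


# Hypothetical demo files, including an empty one and one without a final newline.
paths = []
for text in (b"a\nb\n", b"", b"c\nd"):
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(text)
        paths.append(tmp.name)

lines = list(mmap_lines(paths))

for p in paths:
    os.remove(p)
```

Timing this against a plain `for line in fileinput.input(paths)` loop on your own data is the only reliable way to tell whether the switch is worth it.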