Perl: performance hit with reading multiple files
I was wondering what is better in this case?
I have to read thousands of files. I was thinking of either opening each file, reading it, and closing it, or using cat to combine all the files into one file and reading that.
Suggestions? This is all in Perl.
It shouldn't make that much of a difference. This sounds like premature optimization to me.
If the time spent cat-ing all the files into one bigger file doesn't matter, reading that one file will be faster (but only when it is read sequentially, which is the default).
Of course, if the concatenation step is counted as well, it will be much slower overall, because the data has to be read, written, and then read again.
In general, reading one file of 1000M should be faster than reading 100 files of 10M, because each of the 100 files requires its own metadata lookup.
As tchrist says, the performance difference might not be important. I think it depends on the type of files (for a huge number of very small files the difference would be much larger) and on the overall performance of your system and its storage.
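If you want to check this on your own system rather than guess, a rough timing sketch along these lines might help. Everything here is made up for illustration: the helper slurp_bytes, the file names, and the test data (200 small files plus one file holding the same bytes). The files are freshly written, so they will most likely be served from the OS cache, which means this mostly shows per-file open/close overhead rather than real disk behaviour.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Time::HiRes qw(time);

# Hypothetical helper: read a file in 64K chunks and return the byte count.
sub slurp_bytes {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = 0;
    while (read($fh, my $buf, 65536)) {
        $bytes += length $buf;
    }
    close $fh;
    return $bytes;
}

# Made-up test data: 200 small files and one combined file with the same content.
my $dir      = tempdir(CLEANUP => 1);
my @parts    = map { "$dir/part$_.txt" } 1 .. 200;
my $combined = "$dir/combined.txt";

open my $all, '>:raw', $combined or die "Cannot write $combined: $!";
for my $path (@parts) {
    my $data = "some line of text\n" x 500;
    open my $out, '>:raw', $path or die "Cannot write $path: $!";
    print {$out} $data;
    close $out or die "Cannot close $path: $!";
    print {$all} $data;
}
close $all or die "Cannot close $combined: $!";

# Time reading the small files one after another, then the combined file.
my $t0 = time;
my $many_bytes = 0;
$many_bytes += slurp_bytes($_) for @parts;
my $t1 = time;
my $one_bytes = slurp_bytes($combined);
my $t2 = time;

printf "%d files: %d bytes in %.4f s\n", scalar @parts, $many_bytes, $t1 - $t0;
printf "1 file:    %d bytes in %.4f s\n", $one_bytes, $t2 - $t1;
```

For numbers that mean anything, you would point this at your real files and repeat the runs a few times.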
Note that cat * can fail if the number of files is greater than your ulimit -n value. So sequential read can actually be safer.
Also, consider using opendir and readdir instead of glob if all your files are located in the same dir.
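A minimal sketch of the opendir/readdir approach, assuming all the files sit directly in one directory. The sub names plain_files_in and count_lines_in_dir are invented for this example, and counting lines is only a stand-in for whatever processing you actually need. Each file is opened, read, and closed before the next one, so only one file handle is open at a time.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the sorted full paths of the plain files in $dir.
# readdir also returns "." and "..", which the -f test filters out.
sub plain_files_in {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @paths = grep { -f $_ }
                map  { File::Spec->catfile($dir, $_) }
                readdir $dh;
    closedir $dh;
    @paths = sort @paths;
    return @paths;
}

# Hypothetical helper: read every file in turn and return the total line count.
sub count_lines_in_dir {
    my ($dir) = @_;
    my $lines = 0;
    for my $path (plain_files_in($dir)) {
        open my $fh, '<', $path or die "Cannot open $path: $!";
        while (my $line = <$fh>) {
            $lines++;    # replace with the real per-line processing
        }
        close $fh;
    }
    return $lines;
}

# Usage: pass a directory on the command line, e.g. perl script.pl /some/dir
if (@ARGV) {
    printf "%d lines in %s\n", count_lines_in_dir($ARGV[0]), $ARGV[0];
}
```

Subdirectories are skipped rather than descended into; for a nested tree, something like File::Find would probably be a better fit.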
Just read the files sequentially. Perl's file I/O functions are pretty thin wrappers around the native file I/O calls in the OS, so there isn't much point in fretting about performance for simple file I/O.