Reading a large file line by line or storing its lines in an array
I have a large file, 100,000 lines long. I can read each line and process it as I go, or I can store the lines in an array and then process them. I would prefer to use the array for the extra features, but I'm really concerned about the memory usage associated with storing that many lines in an array, and whether it's worth it.
There are two functions you should familiarize yourself with.
The first is file(), which reads an entire file into an array, with each line as an array element. This is good for shorter files, and probably isn't what you want to use on a 100k-line file. This function handles its own file management, so you don't need to explicitly open and close the file yourself.
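A minimal sketch of that approach (the filename data.txt is just a placeholder):

```php
// Read the whole file into an array in one call.
// FILE_IGNORE_NEW_LINES strips the trailing newline from each element;
// FILE_SKIP_EMPTY_LINES drops blank lines.
$lines = file('data.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($lines as $i => $line) {
    // process $line here
}
```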
The second is fgets(), which you can use to read a file one line at a time. You can loop for as long as there are more lines to process, and run your line processing inside the loop. You'll need to use fopen() to get a handle on the file, and you may want to track the file pointer yourself for recovery management (i.e., so you won't have to restart processing from scratch if something goes sideways and the script fails).
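Something along these lines, using the same placeholder filename; ftell() is one way to track the position for the recovery idea mentioned above:

```php
// Stream the file one line at a time instead of loading it all at once.
$handle = fopen('data.txt', 'r');
if ($handle === false) {
    die('Unable to open file');
}

while (($line = fgets($handle)) !== false) {
    // process $line here

    // Byte offset just past the line we read; persist it somewhere
    // if you want to be able to resume after a failure.
    $offset = ftell($handle);
}

fclose($handle);
```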
Hopefully that's enough to get you started.
How about a combination of the two? Read 1000 lines into an array, process them, clear the array, then read the next 1000, and so on. Monitor memory usage and adjust how many lines you read into the array at a time.
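A rough sketch of that chunked approach; processChunk() is a hypothetical stand-in for whatever you do with each batch, and the chunk size of 1000 is just a starting point to tune while watching memory_get_usage():

```php
function processChunk(array $lines)
{
    // placeholder for your actual per-batch processing
}

$chunkSize = 1000;
$handle = fopen('data.txt', 'r');
$buffer = [];

while (($line = fgets($handle)) !== false) {
    $buffer[] = $line;
    if (count($buffer) >= $chunkSize) {
        processChunk($buffer);
        $buffer = [];          // free the batch before reading the next one
    }
}

if ($buffer) {
    processChunk($buffer);     // don't forget the final partial batch
}

fclose($handle);
```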