popen performance in C
I'm designing a program I plan to implement in C and I have a question about the best way (in terms of performance) to call external programs. The user is going to provide my program with a filename, and then my program is going to run another program with that file as input. My program is then going to process the output of the other program.
My typical approach would be to redirect the other program's output to a file and then have my program read that file when it's done. However, I understand I/O operations are quite expensive and I would like to make this program as effi开发者_StackOverflow中文版cient as possible.
I did a little bit of looking and I found the popen
command for running system commands and grabbing the output. How does the performance of this approach compare to the performance of the approach I just described? Does popen
simply write the external program's output to a temporary file, or does it keep the program output in memory?
Alternatively, is there another way to do this that will give better performance?
On Unix systems, popen
will pass data through an in-memory pipe. Assuming the data isn't swapped out, it won't hit disk. This should give you just about as good performance as you can get without modifying the program being invoked.
popen does pretty much what you are asking for: it does the pipe-fork-exec idiom and gives you a file pointer that you can read and write from.
However, there is a limitation on the size of the pipe buffer (~4K iirc), and if you arent reading quickly enough, the other process could block.
Do you have access to shared memory as a mount point? [on linux systems there is a /dev/shm mountpoint]
1) popen
keep the program output in memory. It actually uses pipes to transfer data between the processes.
2) popen
looks IMHO as the best option for performance.
It also have an advantage over files of reducing latency. I.e. your program will be able to get the other program output on the fly, while it is produced. If this output is large, then you don't have to wait until the other program is finished to start processing its output.
The problem with having your subcommand redirect to a file is that it's potentially insecure while popen
communication can't be intercepted by another process. Plus you need to make sure the filename is unique if you're running several instances of your master program (and thus of your subcommand). The popen
solution doesn't suffer from this.
The performance of popen
is just fine as long as your don't read/write one byte chunks. Always read/write multiples of 512 (like 4096). But that does apply to file operations as well. popen
connects your process and the child process through pipes, so if you don't read then the pipe fills up and the child can't write and vice versa. So all the exchanged data is in memory, but it's only small amounts.
(Assuming Unix or Linux)
Writing to the temp file may be slow if the file is on a slow disk. It also means the entire output will have to fit on the disk.
popen
connects to the other program using a pipe, which means that output will be sent to your program incrementally. As it is generated, it is copied to your program chunk-by-chunk.
精彩评论