How can fprintf block in Linux?
I have a fprintf
call which blocked for about unreasonable 10 seconds when the system is IO busy but still has lots of CPU. I ne开发者_Python百科ither setvbuf
of the underlying stream nor opened the underlying fd with O_DIRECT
. This means the stream has both stdio stream and system cache.
I cannot tell how can a fprintf
call get blocked for such a long period. For the worst case in which the stream buffer of the underlying stream is full, I think the libc just call write(2)
to write the buffer content to disk. But to my knowledge, write(2)
on an fd that is not opened with O_DIRECT
will not wait for the disk transfer, i.e. it is asynchronous as to disk IO. So the only time consuming work I could think of is the kernel's allocating cache for the written data, but this does not seem like a 10-second work even when the system is lack of memory. In fact, the system has tens of MB of free memory and several GB of memory cached.
Is there any advice?
Thanks.
Did you run your code on more than one machine? If so, the behaviour is the same?
If not, check for disk access errors in your system's log, once I had a similar problem and the reason was a faulty disk.
A write call is not asynchronous, ie the system call must return from the kernel. Of course this system call, is mainly putting data on a I/O queue, that will do the real transfer later. However, I guess this I/O queue may be inacessible for quite some time.
This is independent of the media you are writing to, as long as it is a block device. I have a program writing streams of JPEG images to an SD card. This program will get stuck because of operations taking place on the hard disk.
However, if this is a problem with the write call blocking for ten seconds, you can try to strace your program. I don't know how or if it will disturb the behaviour you try to observe
strace -T -e trace=write progname
If you want to monitor your log activity, may be you can redirect your log write to a pipe or a named pipe, put a process that is essentially reading from this pipe and writing to your logfile, and strace it.
You said you didn't call setvbuf()
on the stream, but what kind of buffering is the stream using? Unless the stream you're using is stderr
(or sometimes stdout
), it will be block buffered by default, which means that no writes will happen until the buffer is full (typically 4096 bytes or more). Depending on how frequently you are writing logs, this could cause your perceived 10 second delay.
So I would try: setvbuf(my_stream, NULL, _IONBF, 0);
EDIT: ...Or, it's possible I completely misinterpreted the word "blocked". Leaving anyway, just in case.
If your process is writing large amounts of data to disk, it will create many dirty pages in the page cache, and write()
will eventually force your process to do synchronous writeback for some of those pages. In this situation, the kernel won't return back to your process until it's written out a reasonable amount of data to disk (10 seconds still sounds a little high, though).
精彩评论