
Overhead of times() system call - relative to file operations

What is the relative overhead of calling times() versus file operations like reading a line with fread()?

I realize this likely differs from OS to OS and depends on how long the line is, where the file is located, if it's really a pipe that's blocked (it's not), etc.

Most likely the file is not local but is on a mounted NFS drive somewhere on the local network. The common case is a line that is 20 characters long. If it helps, assume Linux kernel 2.6.9. The code will not be run on Windows.

I'm just looking for a rough guide. Is it on the same order of magnitude? Faster? Slower?

Ultimate goal: I'm looking at implementing a progress callback routine, but I don't want to call it too frequently (because the callback is likely very expensive). The majority of the work is reading a text file (line by line) and doing something with each line. Unfortunately, some of the lines are very long, so simply calling the callback every N lines isn't effective in the all-too-common pathological cases.

I'm avoiding writing a benchmark because I'm afraid of writing it wrong and am hoping the wisdom of the crowd is greater than my half-baked tests.


fread() is a C library function, not a system call. fread(), fwrite(), fgets() and friends are all buffered I/O by default (see setbuf), which means that the library allocates a buffer, decreasing the frequency with which the read() and write() system calls need to be made.

This means that if you're reading sequentially from the file, the library will only issue a system call every, say, 100 reads (subject to the buffer size and how much data you read at a time).

When the read() and write() system calls are made, however, they will definitely be slower than calling times(), simply due to the volume of data that needs to be exchanged between your program and the kernel. If the data is cached in the OS's buffers (e.g. it was written by another process on the same machine moments ago) then it will still be pretty fast. If the data is not cached, then you will have to wait for I/O (be it to the disk or over the network), which is very slow in comparison.

If the data is coming fresh over NFS, then I'd be pretty confident that calling times() will be faster than fread() on average.


On Linux, you could write a little program that does lots of calls to times() and fread() and measures the syscall times with strace -c.

e.g.:

struct tms t_buf;
char buf[BUF];

for (i = 0; i < RUNS; i++) {
    times(&t_buf);
    fread(buf, 1, BUF, fh);
}

This is with BUF set to 4096 (fread() will actually call read() every time):

# strace -c ./time_times
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 59.77    0.001988           0    100000           read
 40.23    0.001338           0     99999           times

and this is with BUF set to 16:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.00    0.001387           0     99999           times
  1.00    0.000014           0       392           read


times() simply reads kernel-maintained, process-specific data. The kernel maintains this data anyway, to supply information for the wait() system call when the process exits, so it is always kept up to date regardless of whether times() ever gets called. The extra overhead of calling times() is really low.

fread(), fwrite(), etc. call the underlying system calls read() and write(), which invoke drivers. The drivers then place data in a kernel buffer. This is far more costly in terms of resources than invoking times().

Is this what you are asking?

