Performance of fwrite and write size
I'm writing out a large numerical two-dimensional array to a binary file (final size ~75 MB).
I'm doing this on a Linux system. First, is there a better method or syscall than fwrite to write the file as fast as possible?
Second, if I should use fwrite, should I just write the whole file as one contiguous block?
fwrite( buf, sizeof(float), 6700*6700, fp );
or write it as a series of chunks
fwrite( buf, sizeof(float), 8192, fp );
fwrite( buf + 8192, sizeof(float), 8192, fp );
....
If I should chunk the writing, how big should each chunk be?
I agree with miked and Jerome for the most part, but... only for a modern OS. If you are working on an embedded flash file system, there are some major exceptions. In that environment, if you suspect fwrite(), invest in a quick test using write() with large blocks.
Today I found a 4x speed improvement by moving to write(). This was due to a POSIX layer in the embedded OS that transcribed fwrite()s into fputc()s... a sync'd underlying flash file just thrashes in this case. write() was implemented by routines far closer to the OS (Nucleus), in which the block writes were not broken into bytes.
Just saying... if you're unsure between the two variants, it's probably best to just try them out, along the lines of the sketch below.
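For what it's worth, a quick-and-dirty comparison could look something like this. This is only a sketch, assuming a POSIX environment: the file names and the 6700*6700 element count are placeholders, and real code should check every return value.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define NFLOATS (6700 * 6700)

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    float *buf = calloc(NFLOATS, sizeof(float));
    if (!buf)
        return 1;

    /* Variant 1: buffered stdio, one call. */
    double t0 = now_sec();
    FILE *fp = fopen("test_fwrite.bin", "wb");
    fwrite(buf, sizeof(float), NFLOATS, fp);   /* real code: check return */
    fclose(fp);
    printf("fwrite: %.3f s\n", now_sec() - t0);

    /* Variant 2: one large write(2), no stdio buffering. */
    t0 = now_sec();
    int fd = open("test_write.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    write(fd, buf, NFLOATS * sizeof(float));   /* real code: check return */
    close(fd);
    printf("write:  %.3f s\n", now_sec() - t0);

    free(buf);
    return 0;
}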
Just use fwrite (no need to drop to lower-level syscalls) and do it as one chunk. The layers below will figure out how best to buffer and split up that write. I've never been able to beat fwrite's performance on things like this: large sequential writes.
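If it helps, a sketch of the one-chunk approach with basic error checking might look like this (write_array is just a hypothetical helper name):

#include <stdio.h>

/* One-chunk fwrite with basic error checking. */
int write_array(const float *buf, size_t n, const char *path)
{
    FILE *fp = fopen(path, "wb");
    if (!fp)
        return -1;
    size_t written = fwrite(buf, sizeof(float), n, fp);
    /* fclose() flushes the stdio buffer, so its result matters too. */
    if (fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}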
You would probably get higher performance by using mmap(): create room for your array in virtual address space and THEN write into 'memory' rather than to disk.
Let the system do it for you: it is likely to allocate as few pages as possible, something that is not going to happen with a 75 MB buffer dumped by fwrite().
In a world of restricted CPU caches, playing with huge buffers is a no-go (that's why malloc() uses mmap() for large allocations). By attaching your buffer to a file when you set up mmap() - and before filling the buffer - you will save the system a LOT of work.
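Here is a minimal sketch of that idea, assuming a POSIX system; the file name and fill pattern are placeholders:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t n = 6700UL * 6700UL;
    const size_t bytes = n * sizeof(float);

    int fd = open("array.bin", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0 || ftruncate(fd, (off_t)bytes) != 0)
        return 1;

    /* Map the file itself: filling the mapping writes 'to memory',
       and the kernel pages the data out to disk behind the scenes. */
    float *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED)
        return 1;

    for (size_t i = 0; i < n; i++)
        buf[i] = (float)i;          /* placeholder fill pattern */

    msync(buf, bytes, MS_SYNC);     /* optionally force dirty pages out */
    munmap(buf, bytes);
    close(fd);
    return 0;
}

MAP_SHARED is what ties the pages to the file; with MAP_PRIVATE the stores would never reach disk.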
One chunk is faster. There are several reasons for that:
1) Writing to disk also means keeping all the additional information in the file system up to date (timestamps, file size, used clusters, locks, etc.), so there is some overhead associated with each file access (especially write access).
2) Disk I/O is slow, so the OS usually tries to do some caching on its side. This means that each time you use file I/O there will be additional checks: is it cached, should it be cached, and so on.
You can find the source of fwrite in
http://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofwrite.c;hb=HEAD
As you can see, this in turn calls _IO_sputn, which eventually ends up in
http://sourceware.org/git/?p=glibc.git;a=blob;f=libio/fileops.c;hb=HEAD
(specifically, _IO_new_file_xsputn). Note that this path always goes through the stdio buffer.
So I would advise against using stdio; writing directly using write(2) will bypass this extra copy.
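If you go that route, keep in mind that write(2) may write fewer bytes than requested, so the usual pattern is a small retry loop. A sketch, with write_all as a hypothetical helper:

#include <errno.h>
#include <unistd.h>

/* write() may return a short count, so loop until everything is out. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;              /* real error; errno is set */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)count;
}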