Performance of fwrite and write size
I'm writing out a large numerical two-dimensional array to a binary file (final size ~75 MB).
I'm doing this on a Linux system. First, is there a better method or syscall than fwrite to write the file as fast as possible?
Second, if I should use fwrite, should I just write the whole file as one contiguous block?
fwrite( buf, sizeof(float), 6700*6700, fp );
or write it as a series of chunks
fwrite( buf, sizeof(float), 8192, fp );
fwrite( buf + 8192, sizeof(float), 8192, fp );
....
If I should chunk the writing, how big should each chunk be?
I agree with miked and Jerome for the most part, but... only for a modern OS. If you are working on an embedded flash file system, there are some major exceptions. In that environment, if you suspect fwrite(), invest in a quick test using write() with large blocks.
Today I found a 4x speed improvement by moving to write(). This was due to a POSIX layer in the embedded OS that transcribed fwrite()s into fputc()s... a sync'd underlying flash file just thrashes in this case. write() was implemented by routines far closer to the OS (Nucleus), in which the block writes were not broken into bytes.
Just saying... if you're unsure between the two variants, it's probably best to just try them out, along the lines of the sketch below.
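For what it's worth, a quick-and-dirty comparison could look something like this. This is only a sketch, assuming a POSIX environment: the file names and the 6700*6700 element count are placeholders, and real code should check every return value.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define NFLOATS (6700 * 6700)

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    float *buf = calloc(NFLOATS, sizeof(float));
    if (!buf)
        return 1;

    /* Variant 1: buffered stdio, one call. */
    double t0 = now_sec();
    FILE *fp = fopen("test_fwrite.bin", "wb");
    fwrite(buf, sizeof(float), NFLOATS, fp);   /* real code: check return */
    fclose(fp);
    printf("fwrite: %.3f s\n", now_sec() - t0);

    /* Variant 2: one large write(2), no stdio buffering. */
    t0 = now_sec();
    int fd = open("test_write.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    write(fd, buf, NFLOATS * sizeof(float));   /* real code: check return */
    close(fd);
    printf("write:  %.3f s\n", now_sec() - t0);

    free(buf);
    return 0;
}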
Just use fwrite (no need to drop to lower-level syscalls) and do it as one chunk. The layers below will figure out how best to buffer and split up that write. I've never been able to beat fwrite's performance on things like this: large sequential writes.
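If it helps, a sketch of the one-chunk approach with basic error checking might look like this (write_array is just a hypothetical helper name):

#include <stdio.h>

/* One-chunk fwrite with basic error checking. */
int write_array(const float *buf, size_t n, const char *path)
{
    FILE *fp = fopen(path, "wb");
    if (!fp)
        return -1;
    size_t written = fwrite(buf, sizeof(float), n, fp);
    /* fclose() flushes the stdio buffer, so its result matters too. */
    if (fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}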
You would probably get higher performance by using mmap(): create room for your array in virtual address space and THEN write into 'memory' rather than to disk.
Let the system do it for you: it is likely to allocate as few pages as possible, something that is not going to happen with a 75 MB buffer dumped by fwrite().
In a world of restricted CPU caches, playing with huge buffers is a no-go (that's why malloc() uses mmap() for large allocations). By attaching your buffer to a file when you set up mmap() - and before filling the buffer - you will save the system a LOT of work.
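Here is a minimal sketch of that idea, assuming a POSIX system; the file name and fill pattern are placeholders:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t n = 6700UL * 6700UL;
    const size_t bytes = n * sizeof(float);

    int fd = open("array.bin", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0 || ftruncate(fd, (off_t)bytes) != 0)
        return 1;

    /* Map the file itself: filling the mapping writes 'to memory',
       and the kernel pages the data out to disk behind the scenes. */
    float *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED)
        return 1;

    for (size_t i = 0; i < n; i++)
        buf[i] = (float)i;          /* placeholder fill pattern */

    msync(buf, bytes, MS_SYNC);     /* optionally force dirty pages out */
    munmap(buf, bytes);
    close(fd);
    return 0;
}

MAP_SHARED is what ties the pages to the file; with MAP_PRIVATE the stores would never reach disk.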
One chunk is faster. There are several reasons for that:
1) Writing to disk also means keeping all the additional information in the file system up to date (timestamps, file size, used clusters, locks, etc.), so there is some overhead associated with each file access (especially write access).
2) Disk I/O is slow, so the OS usually tries to do some caching on its side. This means that each time you use file I/O there will be additional checks: is it cached, should it be cached, and so on.
You can find the source of fwrite in
http://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofwrite.c;hb=HEAD
As you can see, this in turn calls _IO_sputn, which eventually ends up in
http://sourceware.org/git/?p=glibc.git;a=blob;f=libio/fileops.c;hb=HEAD
(specifically, _IO_new_file_xsputn). Note that this path always goes through the stdio buffer.
So I would advise against using stdio; writing directly using write(2) will bypass this extra copy.
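If you go that route, keep in mind that write(2) may write fewer bytes than requested, so the usual pattern is a small retry loop. A sketch, with write_all as a hypothetical helper:

#include <errno.h>
#include <unistd.h>

/* write() may return a short count, so loop until everything is out. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;              /* real error; errno is set */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)count;
}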