Why does fopen/fgets use both mmap and read system calls to access the data?
I have a small example program that simply fopens a file and uses fgets to read it. Using strace, I notice that the first call to fgets runs an mmap system call, and then read system calls are used to actually read the contents of the file. On fclose, the file is munmapped. If I instead read the file with open/read directly, this obviously does not occur. I'm curious what the purpose of this mmap is and what it accomplishes.
On my Linux 2.6.31-based system, under heavy virtual memory demand these mmaps will sometimes hang for several seconds, and they appear to me to be unnecessary.
The example code:
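#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    FILE *f;
    char buf[256];

    if (NULL == (f = fopen("foo.txt", "r")))
    {
        printf("Fail to open\n");
        return EXIT_FAILURE;  /* do not call fgets on a NULL stream */
    }
    /* The first read on the stream is where the mmap2 call shows up in strace. */
    if (fgets(buf, sizeof buf, f) == NULL)
        printf("Nothing read\n");
    fclose(f);
    return EXIT_SUCCESS;
}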
And here is the relevant strace output when the above code is run:
open("foo.txt", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=9, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb8039000
read(3, "foo\nbar\n\n"..., 4096) = 9
close(3) = 0
munmap(0xb8039000, 4096) = 0
It's not the file that is mmap'ed. In this case mmap is used anonymously (not on a file), probably to allocate memory for the buffer that the subsequent reads will use. malloc can in fact result in such a call to mmap. Similarly, the munmap corresponds to a call to free.
The mmap is not mapping the file; instead it's allocating memory for the stdio FILE buffering. Normally malloc would not use mmap to service such a small allocation, but it seems glibc's stdio implementation is using mmap directly to get the buffer. This is probably to ensure it's page-aligned (though posix_memalign could achieve the same thing) and/or to make sure closing the file returns the buffer memory to the kernel. I question the usefulness of page-aligning the buffer. Presumably it's for performance, but I can't see any way it would help unless the file offset you're reading from is also page-aligned, and even then it seems like a dubious micro-optimization.
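If the per-stream buffer allocation is the concern, one thing to try is handing stdio a buffer of your own with setvbuf before the first read. The sketch below assumes that glibc then skips its own buffer allocation; that is the expected behaviour, but it is worth confirming with strace on your system. The function name read_first_line and the static buffer are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads the first line of `path` into `line`,
 * using a caller-owned stdio buffer instead of one allocated by the library.
 * Returns 0 on success, -1 on failure. */
int read_first_line(const char *path, char *line, int line_size)
{
    /* Static so that it outlives the stream; this makes the helper
     * non-reentrant, which is acceptable for a sketch. */
    static char io_buf[BUFSIZ];

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(f, io_buf, _IOFBF, sizeof io_buf) != 0) {
        fclose(f);
        return -1;
    }

    int ok = (fgets(line, line_size, f) != NULL);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling read_first_line("foo.txt", buf, sizeof buf) in place of the fopen/fgets/fclose sequence from the question and comparing the strace output should show whether the mmap2/munmap pair disappears.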
From what I have read, memory-mapping functions are useful when handling large files. What counts as large, I have no idea, but for large files they can be significantly faster than the 'buffered' I/O calls.
In the example you posted, I think the file is opened by the open() function and mmap is used for allocating memory or something else.
This can be seen clearly from the signature of the mmap function:
void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off);
The second-to-last parameter takes the file descriptor, which should be non-negative when a file is being mapped, while in the strace output it is -1.
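For comparison, here is a minimal sketch of an anonymous mapping made with the same arguments that appear in the strace output: MAP_PRIVATE|MAP_ANONYMOUS, a file descriptor of -1, and an offset of 0. The helper names alloc_anonymous and free_anonymous are hypothetical, and this is not a copy of what glibc does internally, only an illustration of mmap acting as a memory allocator rather than a file mapper.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: obtains `len` bytes of zero-filled, page-aligned
 * memory straight from the kernel. No file is involved, hence fd = -1.
 * Returns NULL on failure. */
void *alloc_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Hypothetical helper: returns the memory to the kernel, like the munmap
 * seen in the trace. Returns 0 on success, -1 on failure. */
int free_anonymous(void *p, size_t len)
{
    return munmap(p, len);
}
```

Running a program that calls alloc_anonymous(4096) under strace should produce an mmap line of the same shape as the one in the question, even though no file is ever opened.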
The source code of fopen in glibc shows that mmap can actually be used:
https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofopen.c;h=965d21cd978f3acb25ca23152993d9cac9f120e3;hb=HEAD#l36