The difference between fork(), vfork(), exec() and clone()
I was looking to find the difference between these four on Google and I expected there to be a huge amount of information on this, but there really wasn't any solid comparison between the four calls.
I set about trying to compile a kind of basic at-a-glance look at the differences between these system calls and here's what I got. Is all this information correct/am I missing anything important ?
Fork
: The fork call basically makes a duplicate of the current process, identical in almost every way (not everything is copied over, for example, resource limits in some implementations but the idea is to create as close a copy as possible).
The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork - the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works - if not, no child is created and the parent gets an error code.
Vfork
: The basic difference between vfork()
and fork()
is that when a new process is created with vfork()
, the parent process is temporarily suspended, and the child process might borrow the parent's address space. This strange state of affairs continues until the child process either exits, or calls execve()
, at which point the parent
process continues.
This means that the child process of a vfork()
must be careful to avoid unexpectedly modifying variables of the parent process. In particular, the child process must not return from the function containing the vfork()
call, and it must not call exit()
(if it needs to exit, it should use _exit()
; actually, this is also true for the child of a normal fork()
).
Exec
: The exec call is a way to basically replace the entire current process with a new 开发者_运维知识库program. It loads the program into the current process space and runs it from the entry point. exec()
replaces the current process with a the executable pointed by the function. Control never returns to the original program unless there is an exec()
error.
Clone
: clone()
, as fork()
, creates a new process. Unlike fork()
, these calls allow the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers.
When the child process is created with clone()
, it executes the function application fn(arg) (This differs from fork()
, where execution continues in the child from the point of the original fork()
call.) The fn argument is a pointer to a function that is called by the child process at the beginning of its execution. The arg argument is passed to the fn function.
When the fn(arg) function application returns, the child process terminates. The integer returned by fn is the exit code for the child process. The child process may also terminate explicitly by calling exit(2)
or after receiving a fatal signal.
Information gotten from:
- Differences between fork and exec
- http://www.allinterview.com/showanswers/59616.html
- http://www.unixguide.net/unix/programming/1.1.2.shtml
- http://linux.about.com/library/cmd/blcmdl2_clone.htm
Thanks for taking the time to read this ! :)
vfork()
is an obsolete optimization. Before good memory management,fork()
made a full copy of the parent's memory, so it was pretty expensive. since in many cases afork()
was followed byexec()
, which discards the current memory map and creates a new one, it was a needless expense. Nowadays,fork()
doesn't copy the memory; it's simply set as "copy on write", sofork()
+exec()
is just as efficient asvfork()
+exec()
.clone()
is the syscall used byfork()
. with some parameters, it creates a new process, with others, it creates a thread. the difference between them is just which data structures (memory space, processor state, stack, PID, open files, etc) are shared or not.
execve()
replaces the current executable image with another one loaded from an executable file.fork()
creates a child process.vfork()
is a historical optimized version offork()
, meant to be used whenexecve()
is called directly afterfork()
. It turned out to work well in non-MMU systems (wherefork()
cannot work in an efficient manner) and whenfork()
ing processes with a huge memory footprint to run some small program (think Java'sRuntime.exec()
). POSIX has standardized theposix_spawn()
to replace these latter two more modern uses ofvfork()
.posix_spawn()
does the equivalent of afork()/execve()
, and also allows some fd juggling in between. It's supposed to replacefork()/execve()
, mainly for non-MMU platforms.pthread_create()
creates a new thread.clone()
is a Linux-specific call, which can be used to implement anything fromfork()
topthread_create()
. It gives a lot of control. Inspired onrfork()
.rfork()
is a Plan-9 specific call. It's supposed to be a generic call, allowing several degrees of sharing, between full processes and threads.
fork()
- creates a new child process, which is a complete copy of the parent process. Child and parent processes use different virtual address spaces, which is initially populated by the same memory pages. Then, as both processes are executed, the virtual address spaces begin to differ more and more, because the operating system performs a lazy copying of memory pages that are being written by either of these two processes and assigns an independent copies of the modified pages of memory for each process. This technique is called Copy-On-Write (COW).vfork()
- creates a new child process, which is a "quick" copy of the parent process. In contrast to the system callfork()
, child and parent processes share the same virtual address space. NOTE! Using the same virtual address space, both the parent and child use the same stack, the stack pointer and the instruction pointer, as in the case of the classicfork()
! To prevent unwanted interference between parent and child, which use the same stack, execution of the parent process is frozen until the child will call eitherexec()
(create a new virtual address space and a transition to a different stack) or_exit()
(termination of the process execution).vfork()
is the optimization offork()
for "fork-and-exec" model. It can be performed 4-5 times faster than thefork()
, because unlike thefork()
(even with COW kept in the mind), implementation ofvfork()
system call does not include the creation of a new address space (the allocation and setting up of new page directories).clone()
- creates a new child process. Various parameters of this system call, specify which parts of the parent process must be copied into the child process and which parts will be shared between them. As a result, this system call can be used to create all kinds of execution entities, starting from threads and finishing by completely independent processes. In fact,clone()
system call is the base which is used for the implementation ofpthread_create()
and all the family of thefork()
system calls.exec()
- resets all the memory of the process, loads and parses specified executable binary, sets up new stack and passes control to the entry point of the loaded executable. This system call never return control to the caller and serves for loading of a new program to the already existing process. This system call withfork()
system call together form a classical UNIX process management model called "fork-and-exec".
The fork(),vfork() and clone() all call the do_fork() to do the real work, but with different parameters.
asmlinkage int sys_fork(struct pt_regs regs)
{
return do_fork(SIGCHLD, regs.esp, ®s, 0);
}
asmlinkage int sys_clone(struct pt_regs regs)
{
unsigned long clone_flags;
unsigned long newsp;
clone_flags = regs.ebx;
newsp = regs.ecx;
if (!newsp)
newsp = regs.esp;
return do_fork(clone_flags, newsp, ®s, 0);
}
asmlinkage int sys_vfork(struct pt_regs regs)
{
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.esp, ®s, 0);
}
#define CLONE_VFORK 0x00004000 /* set if the parent wants the child to wake it up on mm_release */
#define CLONE_VM 0x00000100 /* set if VM shared between processes */
SIGCHLD means the child should send this signal to its father when exit.
For fork, the child and father has the independent VM page table, but since the efficiency, fork will not really copy any pages, it just set all the writeable pages to readonly for child process. So when child process want to write something on that page, an page exception happen and kernel will alloc a new page cloned from the old page with write permission. That's called "copy on write".
For vfork, the virtual memory is exactly by child and father---just because of that, father and child can't be awake concurrently since they will influence each other. So the father will sleep at the end of "do_fork()" and awake when child call exit() or execve() since then it will own new page table. Here is the code(in do_fork()) that the father sleep.
if ((clone_flags & CLONE_VFORK) && (retval > 0))
down(&sem);
return retval;
Here is the code(in mm_release() called by exit() and execve()) which awake the father.
up(tsk->p_opptr->vfork_sem);
For sys_clone(), it is more flexible since you can input any clone_flags to it. So pthread_create() call this system call with many clone_flags:
int clone_flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGNAL | CLONE_SETTLS | CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID | CLONE_SYSVSEM);
Summary: the fork(),vfork() and clone() will create child processes with different mount of sharing resource with the father process. We also can say the vfork() and clone() can create threads(actually they are processes since they have independent task_struct) since they share the VM page table with father process.
in fork(), either child or parent process will execute based on cpu selection.. But in vfork(), surely child will execute first. after child terminated, parent will execute.
精彩评论