On Wed, 4 Nov 1998, Georg Nikodym wrote:
When a UNIX process wants to create a child, it fork()s and exec()s.
The fork() duplicates the parent's address space. This was seen as
silly when 9 times out of 10 the child simply calls exec and thus was
born vfork(). vfork() has a problem, though: the child briefly
runs in the parent's address space. The implications of this
are serious and have resulted in two things. First, standards authors
have had no choice but to say that you can't guarantee anything aside
from exec() or _exit() will work. Second, application developers have
been bitten by all kinds of nasty bugs and have shied away from vfork.
The net result is that vfork() could probably be removed from the UNIX
ABI and nobody would shed a tear.
I don't believe this is the case. While fork() does duplicate the address
space, it does not have to physically duplicate the memory used by the
process. First, the read-only parts of mapped files (the text of the
executable and shared libs) can simply be shared. Also, most modern OSes
implement copy-on-write for the writable pages. The original memory image
is shared on a page-by-page basis until one process writes to a shared
page. At that point, only that page is duplicated.
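
As a minimal sketch of what this looks like from user space (assuming a
POSIX system; the helper name parent_value_after_child_write is made up
for illustration): the child writes to a variable it inherited, and the
parent's copy stays untouched. Note that this only shows the visible
semantics. Whether the kernel copied the page lazily or eagerly cannot be
observed this way.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's value of `counter` after the
 * child has written to its own copy, or -1 on error. */
int parent_value_after_child_write(void)
{
    volatile int counter = 1;   /* inherited by the child across fork() */
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */

    if (pid == 0) {
        /* Child: on a copy-on-write kernel this first write is what
         * triggers the private copy of the page holding `counter`. */
        counter = 42;
        _exit(counter == 42 ? 0 : 1);
    }

    /* Parent: wait for the child, then look at our own copy. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return counter;             /* unchanged: the child's write was private */
}
```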
So, { fork(); exec(); } is a relatively efficient operation.
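
For reference, a rough sketch of the { fork(); exec(); } idiom, assuming a
POSIX system. The helper name spawn_and_wait and the use of exit status
127 for a failed exec are my own choices for illustration, not anything
mandated above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by argv[0] (looked up via
 * PATH) and return its exit status, or -1 on error. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();         /* child gets a (lazily copied) address space */

    if (pid < 0)
        return -1;              /* fork failed */

    if (pid == 0) {
        execvp(argv[0], argv);  /* replaces the child's image on success */
        _exit(127);             /* only reached if the exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since the child calls exec almost immediately, very few of the shared
pages are ever written to, so very little is actually copied.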
The overhead that I believe vfork is attempting to eliminate is that
of managing the kernel's internal memory-management data structures when
duplicating a process. vfork allows the child to run until _exit or exec
using the kernel structures of the parent. When the _exit/exec call
"completes", the parent resumes. The exec call gives the child an address
space of its own.
This saves the overhead of (1) marking the shared pages as being shared at
the time of the fork only to (2) release the shared pages back to the
parent upon the subsequent exec.
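
A hedged sketch of the vfork variant, assuming a system that still
provides vfork(). The helper name vspawn_and_wait is hypothetical, and
the _DEFAULT_SOURCE define is only there because some C libraries hide
the vfork() declaration otherwise. The child does nothing but exec or
_exit, since anything else it touches belongs to the parent.

```c
#define _DEFAULT_SOURCE         /* assumption: exposes vfork() on glibc */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at path argv[0] and return its
 * exit status, or -1 on error. */
int vspawn_and_wait(char *const argv[])
{
    pid_t pid = vfork();        /* parent is suspended until exec or _exit */

    if (pid < 0)
        return -1;              /* vfork failed */

    if (pid == 0) {
        /* Child is running on the parent's memory: do nothing but exec. */
        execv(argv[0], argv);
        _exit(127);             /* _exit, not exit: leave the parent's stdio alone */
    }

    /* Parent resumes here once the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```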
Comments?
This is very system dependent, but this is how it's "supposed to" work.
-Justin
vallon(a)mindspring.com