How to replace alloca in an implementation of execvp()?
Take a look at the NetBSD implementation of execvp here:
http://cvsweb.netbsd.se/cgi-bin/bsdweb.cgi/src/lib/libc/gen/execvp.c?rev=1.30.16.2;content-type=text%2Fplain
Note the comment at line 130, in the special case for handling ENOEXEC:
    /*
     * we can't use malloc here because, if we are doing
     * vfork+exec, it leaks memory in the parent.
     */
    if ((memp = alloca((cnt + 2) * sizeof(*memp))) == NULL)
        goto done;
    memp[0] = _PATH_BSHELL;
    memp[1] = bp;
    (void)memcpy(&memp[2], &argv[1], cnt * sizeof(*memp));
    (void)execve(_PATH_BSHELL, __UNCONST(memp), environ);
    goto done;
I am trying to port this implementation of execvp to standalone C++. alloca is nonstandard, so I want to avoid it. (Actually the function I want is execvpe from FreeBSD, but this demonstrates the problem more clearly.)
I think I understand why it would leak memory if plain malloc were used: while the caller of execvp can execute code in the parent, the inner call to execve never returns, so the function cannot free the memp pointer, and there's no way to get the pointer back to the caller. However, I can't think of a way to replace alloca; it seems to be necessary magic to avoid this memory leak. I have heard that C99 provides variable length arrays, which sadly I cannot use, as the eventual target is C++.
Is it possible to replace this use of alloca? If it's mandated to stay within C++/POSIX, is there an inevitable memory leak when using this algorithm?
Edit: As Michael has pointed out in the comments, what is written below really won't work in the real world due to stack-relative addressing by an optimizing compiler. Therefore a production-level alloca needs the help of the compiler to actually "work". But hopefully the code below could give some ideas about what's happening under the hood, and how a function like alloca might have worked if there were no stack-relative addressing optimizations to worry about.
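For comparison, here is a minimal sketch of what "help of the compiler" can look like, assuming GCC or Clang, which both provide the __builtin_alloca builtin (it is not standard C++, so this does not solve the portability problem in the question). The helper name count_shell_argv and the hard-coded "/bin/sh" are my own illustrative assumptions; a real execvp would call execve where the comment indicates instead of counting entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: builds {"/bin/sh", path, argv[1], ..., NULL} inside
// this function's own stack frame and returns the number of non-NULL entries.
// __builtin_alloca is a GCC/Clang extension, not standard C++.
std::size_t count_shell_argv(const char *path, char *const argv[]) {
    assert(argv != nullptr && argv[0] != nullptr);

    std::size_t cnt = 0;                     // counts argv[0] as well
    while (argv[cnt] != nullptr) ++cnt;

    // Slots needed: "/bin/sh", path, argv[1..cnt-1], NULL  => cnt + 2
    char **memp = static_cast<char **>(
        __builtin_alloca((cnt + 2) * sizeof(char *)));
    memp[0] = const_cast<char *>("/bin/sh"); // assumed shell path
    memp[1] = const_cast<char *>(path);
    // Copies argv[1..cnt], which includes the terminating NULL.
    std::memcpy(&memp[2], &argv[1], cnt * sizeof(char *));

    // A real execvp would call execve("/bin/sh", memp, environ) here.
    std::size_t n = 0;
    while (memp[n] != nullptr) ++n;
    return n;  // memp is released automatically when this function returns
}
```

Because the compiler knows about the allocation, it can keep its own addressing of locals consistent, which is exactly what the hand-written assembly version cannot guarantee.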
BTW, just in case you were still curious about how you could make a simple version of alloca for yourself: since that function basically returns a pointer to allocated space on the stack, you can write a function in assembly that manipulates the stack and returns a pointer you can use in the current scope of the caller (once the caller returns, the stack space pointer from this version of alloca is invalidated, since the return from the caller cleans up the stack).
Assuming you're using some flavor of Linux on an x86_64 platform using the Unix 64-bit (System V AMD64) ABI, place the following inside a file called "my_alloca.s":
    .section .text
    .global my_alloca
my_alloca:
    movq (%rsp), %r11    # save the return address in temp register
    subq %rdi, %rsp      # allocate space on stack from first argument
    movq $0x10, %rax
    negq %rax
    andq %rax, %rsp      # align the stack to 16-byte boundary
    movq %rsp, %rax      # save address in return register
    pushq %r11           # push return address on stack
    ret                  # return back to caller
Then inside your C/C++ code module (i.e., your ".cpp" files), you can use it the following way:
    #include <stddef.h>
    // size_t matches the full 64-bit %rdi that the assembly reads.
    // extern "C" is required in C++; drop it when compiling as C.
    extern "C" void *my_alloca(size_t size);
    void function()
    {
        void *stack_allocation = my_alloca(BUFFERSIZE);
        //...do something with the allocated space
        return; //WARNING: stack_allocation will be invalid after return
    }
You can compile "my_alloca.s" using gcc -c my_alloca.s. This will give you a file named "my_alloca.o" that you can then link with your other object files using gcc -o or ld.
The main "gotcha" that I could think of with this implementation is that you could crash or end up with undefined behavior if the compiler does not set up an activation record with a stack base pointer (i.e., the RBP register on x86_64) and instead tracks the frame relative to the stack pointer. In that case, since the compiler isn't aware of the memory we've allocated on the stack, when it cleans up the stack at the return of the caller and tries to jump back using what it believes is the caller's return address (pushed on the stack at the beginning of the function call), it will jump to an instruction pointer that's pointing to no-wheres-ville, and you'll most likely crash with a bus error or some type of access error, since you'll be trying to execute code in a memory location you're not allowed to.
There are actually other dangerous things that could happen, such as the compiler using stack space to pass the arguments (it shouldn't for this function per the Unix 64-bit ABI, since there's only a single argument), as that would again cause a stack clean-up right after the function call, invalidating the pointer. But with a function like execvp(), which won't return unless there's an error, this shouldn't be so much of an issue.
All in all, a function like this will be platform-dependent.
You can replace the call to alloca with a call to malloc made before the call to vfork. After vfork returns in the caller, the caller can free the memory it allocated with malloc. (This is safe because vfork will not return in the parent until the child has called exec and the new program has started, or the child has exited.)
This doesn't leak memory in the child because the exec call completely replaces the child's image with the image of the new program, implicitly releasing the memory that the forked process was holding.
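A rough sketch of that idea, under some assumptions of mine: the helper names build_shell_argv and spawn_with_shell_fallback are made up for illustration, the shell path is hard-coded as "/bin/sh" rather than taken from _PATH_BSHELL, and the PATH search that a real execvp performs is left out. The point is only the ordering: allocate in the parent, vfork, exec in the child, free in the parent.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical helper: heap-allocates {"/bin/sh", path, argv[1], ..., NULL}.
// Returns nullptr if malloc fails; the caller must free() the result.
char **build_shell_argv(const char *path, char *const argv[]) {
    assert(argv != nullptr && argv[0] != nullptr);
    std::size_t cnt = 0;                      // counts argv[0] as well
    while (argv[cnt] != nullptr) ++cnt;
    char **memp = static_cast<char **>(std::malloc((cnt + 2) * sizeof(*memp)));
    if (memp == nullptr) return nullptr;
    memp[0] = const_cast<char *>("/bin/sh");  // assumed shell path
    memp[1] = const_cast<char *>(path);
    // Copies argv[1..cnt], which includes the terminating NULL.
    std::memcpy(&memp[2], &argv[1], cnt * sizeof(*memp));
    return memp;
}

// Returns the child's wait status, or -1 on failure.
int spawn_with_shell_fallback(const char *path, char *const argv[]) {
    // Allocate in the parent BEFORE vfork, so the parent still owns it.
    char **shell_argv = build_shell_argv(path, argv);
    if (shell_argv == nullptr) return -1;

    pid_t pid = vfork();
    if (pid == 0) {
        // Child: only exec or _exit from here on.
        execve(path, argv, environ);
        if (errno == ENOEXEC)
            execve("/bin/sh", shell_argv, environ);
        _exit(127);
    }

    // Parent resumes once the child has exec'd or exited.
    std::free(shell_argv);
    if (pid < 0) return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}
```

The cost of this arrangement is that the shell argument vector is built on every call, even though it is only needed in the rare ENOEXEC case.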
Another possible solution is to switch to fork instead of vfork. This will require a little extra code in the caller, because fork returns before the exec call is complete, so the caller will need to wait for it. But once forked, the new process could use malloc safely. My understanding of vfork is that it was basically a poor man's fork, because fork was expensive in the days before kernels had copy-on-write pages. Modern kernels implement fork very efficiently and there's no need to resort to the somewhat dangerous vfork.
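Here is a minimal sketch of the fork variant, assuming a single-threaded caller. The function name run_with_fork and the hard-coded "/bin/sh" are illustrative assumptions, and the PATH search of a real execvp is omitted. Since the child has its own copy of the address space, it can allocate freely (here through std::vector) and nothing it allocates is visible to the parent.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

extern char **environ;

// Returns the child's wait status, or -1 on failure.
int run_with_fork(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: private copy of memory, so heap allocation is harmless.
        execve(path, argv, environ);
        if (errno == ENOEXEC) {
            std::vector<char *> shell_argv;
            shell_argv.push_back(const_cast<char *>("/bin/sh")); // assumed path
            shell_argv.push_back(const_cast<char *>(path));
            // Skip argv[0] if present; the script path takes its place.
            for (char *const *p = argv[0] ? argv + 1 : argv; *p; ++p)
                shell_argv.push_back(*p);
            shell_argv.push_back(nullptr);
            execve("/bin/sh", shell_argv.data(), environ);
        }
        _exit(127);  // both exec attempts failed
    }

    // Parent: fork returned immediately, so wait for the child explicitly.
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;
    }
    return status;
}
```

Unlike the vfork version, an exec failure in the child cannot be reported through errno in the parent; this sketch signals it with exit status 127 instead.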