Why is orig_eax provided in addition to eax?
Why is the orig_eax member included in sys/user.h's struct user_regs_struct?
Because it was in struct pt_regs, which is .... http://tomoyo.sourceforge.jp/cgi-bin/lxr/source/arch/x86/include/asm/user_32.h#L77
     * is still the layout used by user mode (the new
     * pt_regs doesn't have all registers as the kernel
     * doesn't use the extra segment registers)
So a lot of user-space utilities expect an orig_eax field here, and it is included in user_regs_struct too (for compatibility with older debuggers and ptracers).
The next question is: "Why is the orig_eax member included in struct pt_regs?"
It was added in Linux 0.95: http://lxr.linux.no/#linux-old+v0.95/include/sys/ptrace.h#L44
I suspect this was modeled on some other Unix that already had a pt_regs struct. A comment in 0.95 says:
     * this struct defines the way the registers are stored on the
     * stack during a system call.
So the position of orig_eax is defined by the syscall interface. Here it is: http://lxr.linux.no/#linux-old+v0.95/kernel/sys_call.s
     * Stack layout in 'ret_from_system_call':
     * ptrace needs to have all regs on the stack.
     * if the order here is changed, it needs to be
     * updated in fork.c:copy_process, signal.c:do_signal,
     * ptrace.c ptrace.h
     *
     * 0(%esp) - %ebx
    ...
     * 18(%esp) - %eax
    ...
     * 2C(%esp) - orig_eax
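To make those offsets concrete, here is a small user-space sketch of the frame as a C struct. The struct name and the OFF_ constants are hypothetical, and only the ebx, eax and orig_eax offsets come from the comment quoted above; the slots I filled in between them are an assumption based on the usual i386 register order, so treat them as illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the 0.95 system-call stack frame.
 * Only the ebx, eax and orig_eax offsets are taken from the quoted
 * comment; the slots in between are an assumption. */
struct frame_0_95 {
    int32_t ebx, ecx, edx, esi, edi, ebp; /* 0x00 .. 0x14 */
    int32_t eax;                          /* 0x18: later holds the return value */
    int32_t ds, es, fs, gs;               /* 0x1C .. 0x28 */
    int32_t orig_eax;                     /* 0x2C: eax as it was on entry */
};

/* Byte offsets a ptrace-style reader would use to pick out each slot. */
enum {
    OFF_EBX      = offsetof(struct frame_0_95, ebx),
    OFF_EAX      = offsetof(struct frame_0_95, eax),
    OFF_ORIG_EAX = offsetof(struct frame_0_95, orig_eax)
};
```

With 4-byte slots, OFF_EAX works out to 0x18 and OFF_ORIG_EAX to 0x2C, matching the comment.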
Why do we need to save the old eax twice? Because eax will be used for the return value of the syscall (same file, a bit below):
    _system_call:
        cld
        pushl %eax # save orig_eax
        push %gs
        ...
        push %ds
        pushl %eax # save eax. The return value will be put here.
        pushl %ebp
        ...
        call _sys_call_table(,%eax,4)
Ptrace needs to be able to read both the full register state from before the syscall and the syscall's return value. But the return value is written to %eax, so the original eax that was set before the syscall would be lost. The orig_eax field preserves it.
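The double save can be modeled in a few lines of plain C. This is only a sketch, not kernel code: saved_regs, fake_syscall and model_system_call are hypothetical names, and the fixed return value 42 is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical model of the two stack slots discussed above. */
struct saved_regs {
    int32_t eax;      /* overwritten with the syscall's return value */
    int32_t orig_eax; /* keeps the syscall number for ptrace */
};

/* Stand-in for a syscall table entry: every call returns 42. */
static int32_t fake_syscall(int32_t nr)
{
    (void)nr;
    return 42;
}

/* Mimics the entry path: save eax twice, then store the result in one copy. */
static struct saved_regs model_system_call(int32_t eax_on_entry)
{
    struct saved_regs regs;
    regs.orig_eax = eax_on_entry;      /* pushl %eax  # save orig_eax */
    regs.eax      = eax_on_entry;      /* pushl %eax  # save eax */
    regs.eax = fake_syscall(regs.eax); /* return value lands in the eax slot */
    return regs;
}
```

After model_system_call(20), the eax slot holds the return value while orig_eax still holds 20, which is the pair of values a tracer wants to see.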
UPDATE: Thanks to R.. and the great LXR, I did a full search for orig_eax in Linux 0.95.
It is used not only in ptrace, but also in do_signal when restarting a syscall (if a syscall ended with ERESTARTSYS):
    *(&eax) = orig_eax;
UPDATE2: Linus said something interesting about it:
It's important that ORIG_EAX be set to some value that is not a valid system call number, so that the system call restart logic (see the signal handling code) doesn't trigger.
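The restart check and the reason for an invalid ORIG_EAX can be sketched together. This is a simplified model under stated assumptions, not the kernel's do_signal: the names are hypothetical, a negative orig_eax is assumed to mean "not inside a syscall", 512 is the ERESTARTSYS value as I know it from Linux, and the eip adjustment assumes the 2-byte int $0x80 instruction.

```c
#include <stdint.h>

#define MODEL_ERESTARTSYS 512 /* assumed Linux value; renamed to avoid clashes */

struct saved_regs {
    int32_t  eax;      /* return value slot */
    int32_t  orig_eax; /* syscall number, or negative if not in a syscall */
    uint32_t eip;      /* user-space instruction pointer */
};

/* Returns 1 if the syscall was set up to restart, 0 otherwise. */
static int model_restart(struct saved_regs *regs)
{
    if (regs->orig_eax < 0)              /* invalid number: never restart */
        return 0;
    if (regs->eax != -MODEL_ERESTARTSYS) /* syscall did not ask for a restart */
        return 0;
    regs->eax = regs->orig_eax;          /* *(&eax) = orig_eax; */
    regs->eip -= 2;                      /* step back over int $0x80 */
    return 1;
}

/* Convenience wrapper: build a frame, run the check, return the result. */
static struct saved_regs model_after_signal(int32_t eax, int32_t orig_eax,
                                            uint32_t eip)
{
    struct saved_regs regs = { eax, orig_eax, eip };
    model_restart(&regs);
    return regs;
}
```

With orig_eax set to -1 the frame is left untouched even if eax happens to equal -ERESTARTSYS, which is the situation Linus's remark is guarding against.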
UPDATE3: A ptracer app (debugger) can change orig_eax to change the number of the system call that will be made: http://lkml.org/lkml/1999/10/30/82 (in some kernel versions, an attempt to change ORIG_EAX via ptrace failed with EIO)
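A rough model of that redirection, assuming the kernel re-reads the syscall number from the saved orig_eax slot after the tracer has been given a chance to run (that is my understanding of later i386 kernels, not something stated in the linked thread). Every name below is hypothetical and the two "syscalls" are dummies.

```c
#include <stddef.h>
#include <stdint.h>

struct saved_regs {
    int32_t eax;
    int32_t orig_eax;
};

typedef void (*tracer_hook)(struct saved_regs *);

/* Two dummy syscalls with distinguishable results. */
static int32_t sys_a(void) { return 100; }
static int32_t sys_b(void) { return 200; }

static int32_t (*const model_table[])(void) = { sys_a, sys_b };

/* Entry path with an optional tracer stop before dispatch. */
static int32_t model_traced_syscall(int32_t nr, tracer_hook hook)
{
    struct saved_regs regs = { nr, nr };
    int32_t chosen;

    if (hook != NULL)
        hook(&regs);       /* syscall-entry stop: tracer may edit regs */

    chosen = regs.orig_eax; /* number is taken from orig_eax after the stop */
    if (chosen < 0 ||
        (size_t)chosen >= sizeof model_table / sizeof model_table[0])
        return -1;          /* model only: reject unknown numbers */

    regs.eax = model_table[chosen]();
    return regs.eax;
}

/* Example tracer: redirect whatever was requested to syscall 1. */
static void redirect_to_b(struct saved_regs *regs)
{
    regs->orig_eax = 1;
}
```

Calling model_traced_syscall(0, redirect_to_b) runs sys_b even though the traced code asked for syscall 0.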