How are variables in shared libraries referenced by the loader?
I now understand how dynamically linked functions are referenced: through the procedure linkage table (PLT), like below:
Dump of assembler code for function foo@plt:
0x0000000000400528 <foo@plt+0>: jmpq *0x2004d2(%rip) # 0x600a00 <_GLOBAL_OFFSET_TABLE_+40>
0x000000000040052e <foo@plt+6>: pushq $0x2
0x0000000000400533 <foo@plt+11>: jmpq 0x4004f8
(gdb) disas 0x4004f8
No function contains specified address.
But I don't know how dynamic variables are referenced. I found that their values are populated in the GOT once the program has started, but there's no stub like the one above. How does it work?
The dynamic loader relocates all references to variables before transferring control to the user program.
There is no "stub" for them, because once the user program starts executing, it is not possible for the loader to regain control and update variable addresses. If this isn't clear to you, then you have not really understood how the PLT lazy-resolution stub works.
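The difference can be sketched in plain C. This is only a toy model under stated assumptions, not real loader code: `got_foo`, `got_v1`, `foo_lazy_stub` and `eager_relocate` are hypothetical names standing in for a function GOT slot, a data GOT slot, the PLT stub and the loader's start-up pass. A call goes through code, so the stub can run a resolver and patch the slot on first use; a load is just a dereference, so nothing can intercept it and the data slot has to be valid beforehand.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not loader internals. */

int resolver_calls = 0;              /* how often the "resolver" ran */

int real_foo(void) { return 42; }    /* the function's real definition */

int foo_lazy_stub(void);

/* Function slot: initially points at the stub, like a lazy PLT/GOT entry. */
int (*got_foo)(void) = foo_lazy_stub;

/* Data slot: empty until the "loader" fills it in. */
int v1_storage = 7;
int *got_v1 = NULL;

/* First call lands here: resolve, patch the slot, then forward the call. */
int foo_lazy_stub(void) {
    resolver_calls++;
    got_foo = real_foo;
    return got_foo();
}

/* Stands in for the loader relocating variable references before main. */
void eager_relocate(void) {
    got_v1 = &v1_storage;
}

/* What user code effectively does. */
int call_foo(void) { return got_foo(); }

int read_v1(void) {
    /* A plain load cannot trigger a resolver, so the slot must be ready. */
    assert(got_v1 != NULL);
    return *got_v1;
}
```

In this model, calling `eager_relocate()` once and then `call_foo()` twice runs the resolver only on the first call, while `read_v1()` works solely because the slot was filled in advance.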
Global variables are accessed indirectly, via a global offset table.
- When compiling a program, the compiler generates code that performs indirect accesses, and emits relocation information specifying the entry in the global offset table being used.
- The linker performs these relocations when creating the final dynamically loadable object, resulting in machine code that does not need further patching at load time. Only the global offset table entries themselves remain to be filled in by the runtime loader.
To see this in action, consider the following code fragment.
int v1;
int f(void) { return !v1; }
The function f references a global v1. The machine code generated for the function looks like the following (on an i386):
% gcc -c -fpic a.c
% objdump --disassemble --reloc a.o
[snip]
Disassembly of section .text:
00000000 <f>:
   0: 55                   push %ebp
   1: 89 e5                mov %esp,%ebp
   3: e8 fc ff ff ff       call 4 <f+0x4>
                        4: R_386_PC32 __i686.get_pc_thunk.cx
   8: 81 c1 02 00 00 00    add $0x2,%ecx
                        a: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
   e: 8b 81 00 00 00 00    mov 0x0(%ecx),%eax
                        10: R_386_GOT32 v1
  14: 8b 00                mov (%eax),%eax
  16: 85 c0                test %eax,%eax
  18: 0f 94 c0             sete %al
  1b: 0f b6 c0             movzbl %al,%eax
  1e: 5d                   pop %ebp
  1f: c3                   ret
Disassembly of section .text.__i686.get_pc_thunk.cx:
00000000 <__i686.get_pc_thunk.cx>:
0: 8b 0c 24 mov (%esp),%ecx
3: c3 ret
Machine code walk-through:
- (Offsets 0x0 and 0x1) The standard function prologue.
- (Offset 0x3) The call to __i686.get_pc_thunk.cx prepares for PC-relative addressing by loading the address of the instruction after the call into register %ecx.
- (Offset 0x8) The value in %ecx is adjusted to point to the start of the global offset table. This adjustment is signalled by the relocation entry of type R_386_GOTPC.
- (Offset 0xE) The address of global v1 is retrieved. The R_386_GOT32 relocation supplies the offset of v1's entry from the base of the global offset table.
- (Offset 0x14) The value in v1 is retrieved into register %eax.
- (Offsets 0x16--0x1F) The rest of the computation for function f.
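The steps of the walk-through can be mirrored in C as a rough sketch. The array `got`, the index `V1_SLOT`, and the functions `model_fill_got` and `f_model` are hypothetical names chosen for illustration; a real GOT is laid out by the linker and filled by the loader, not declared in source.

```c
#include <assert.h>
#include <stddef.h>

int v1;                         /* the global from the example */

/* Hypothetical stand-in for the global offset table. */
static int *got[3];
enum { V1_SLOT = 1 };           /* assumed slot index, for illustration only */

/* Stands in for the work done before f ever runs: store &v1 in its slot. */
void model_fill_got(void) {
    got[V1_SLOT] = &v1;
}

/* C rendering of what the -fpic machine code for f does. */
int f_model(void) {
    int **got_base = got;               /* offsets 0x3 and 0x8: find the GOT */
    int *v1_addr = got_base[V1_SLOT];   /* offset 0xE: load v1's address     */
    int value;

    assert(v1_addr != NULL);            /* slot must already be populated    */
    value = *v1_addr;                   /* offset 0x14: load v1's value      */
    return !value;                      /* offsets 0x16 onward               */
}
```

The point of the sketch is the double indirection: one load fetches the address of v1 from the table, and a second load fetches the value.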
In the final shared object, the linker patches the function's code to the following:
% gcc -shared -o a.so a.o
% objdump --disassemble a.so
...snip...
0000044c <f>:
44c: 55 push %ebp
44d: 89 e5 mov %esp,%ebp
44f: e8 18 00 00 00 call 46c <__i686.get_pc_thunk.cx>
454: 81 c1 a0 1b 00 00 add $0x1ba0,%ecx
45a: 8b 81 f8 ff ff ff mov -0x8(%ecx),%eax
460: 8b 00 mov (%eax),%eax
462: 85 c0 test %eax,%eax
...snip...
- Assuming that the object gets loaded at offset O in memory, the call instruction at offset 0x44F loads O+0x454 (its return address) into %ecx, and the add instruction at offset 0x454 then adjusts it to O+0x454+0x1BA0, i.e., O+0x1FF4.
- The instruction at offset 0x45A reads from the address 8 bytes below %ecx, which is the slot for v1 in the global offset table, i.e., the slot for v1 is at offset 0x1FEC from the start of the shared object.
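The offset arithmetic can be checked with a few lines of C. The helper name `got_slot_offset` and its parameters are made up for this sketch; the inputs are the values visible in the disassembly above (return address 0x454, immediate 0x1BA0, displacement -8), all taken relative to the load offset O.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reproduces the address computation of the patched
   code, with every value expressed relative to the load offset O. */
uint32_t got_slot_offset(uint32_t return_addr,  /* address after the call   */
                         uint32_t got_delta,    /* immediate of the add     */
                         int32_t slot_disp)     /* displacement in the mov  */
{
    uint32_t got_base = return_addr + got_delta;

    /* Guard against wrap-around in this toy calculation. */
    assert(got_base >= return_addr);

    /* Unsigned addition of a negative displacement wraps modulo 2^32,
       which matches 32-bit address arithmetic. */
    return got_base + (uint32_t)slot_disp;
}
```

With the numbers from the listing, `got_slot_offset(0x454, 0x1BA0, -8)` gives 0x1FEC, and a displacement of 0 gives the %ecx value 0x1FF4.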
Looking at the dynamic relocation records for the shared object, we see a relocation record instructing the runtime loader to store the actual address for v1 at offset 0x1FEC.
% objdump -R a.so
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
...snip...
00001fec R_386_GLOB_DAT v1
...snip...
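What the loader does with such a record can be sketched as follows. This is a simplified model under assumptions: `struct model_reloc` is a hypothetical record that already carries the resolved symbol address, whereas a real Elf32_Rel entry holds an offset plus a packed symbol index and type, and the loader has to look the symbol up first. The image is modelled as a byte buffer whose index 0 corresponds to the load offset O.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation record (not the real ELF layout). */
struct model_reloc {
    uint32_t offset;     /* where to store, relative to the load offset */
    uint32_t sym_value;  /* already-resolved address of the symbol      */
};

/* Model of handling a GLOB_DAT-style record: store the symbol's address
   in the GOT slot the record points at. */
void apply_glob_dat(uint8_t *image, size_t image_size,
                    const struct model_reloc *r)
{
    assert((size_t)r->offset + sizeof r->sym_value <= image_size);
    memcpy(image + r->offset, &r->sym_value, sizeof r->sym_value);
}

/* Read back a 32-bit slot from the modelled image. */
uint32_t read_slot(const uint8_t *image, uint32_t offset)
{
    uint32_t v;
    memcpy(&v, image + offset, sizeof v);
    return v;
}
```

For the record shown above, the model would be called with `offset` set to 0x1FEC; after that, the two loads in f find the address of v1 in the slot without any further help from the loader.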
Further reading:
- Pat Beirne's "Study of ELF loading and relocs" has more information about ELF relocations.