How can adding a function call cause other symbols to become undefined when linking?
I'm hoping someone will be able to help troubleshoot what I think is a linker script issue.
I'm encountering a strange problem after adding a call to a new function. Without the function call, my object files link correctly. With the new call added, however, I get an undefined reference to a symbol from another object file (I've verified with objdump that the symbol is actually present).
Also strangely, with the function call present, if I first link all object files using ld -r (to produce a relocatable output) and then link the result with my link script, there are no undefined references. However, the link script seems to be ignored, since the output binary does not have the correct entry point.
My (cross-compiler) ld version:
> i586-elf-ld --version
GNU ld (GNU Binutils) 2.20.1.20100303
My attempts at proving that the 'missing' symbol is present:
> i586-elf-ld -T link.ld -o kernel32.bin kernel_loader.o main.o stdio.o common.o gdt.o gdt.bin -y putch
main.o: reference to putch
stdio.o: definition of putch
main.o: In function `main':
main.c:(.text+0x1f): undefined reference to `putch'
N.B. When I produced this output, the nasm-assembled object was named gdt.bin; it is really just another .o file.
I can see the symbol that is 'missing' in the appropriate object file:
> i586-elf-objdump -ht stdio.o
stdio.o: file format elf32-i386
Sections:
Idx Name     Size     VMA      LMA      File off Algn
  0 .text    000002f9 00000000 00000000 00000034 2**2
             CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data    0000000c 00000000 00000000 00000330 2**2
             CONTENTS, ALLOC, LOAD, DATA
  2 .bss     00000008 00000000 00000000 0000033c 2**2
             ALLOC
  3 .comment 00000012 00000000 00000000 0000033c 2**0
             CONTENTS, READONLY
SYMBOL TABLE:
00000000 l df *ABS* 00000000 stdio.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 g F .text 00000016 strlen
00000016 g F .text 0000005c scroll
00000008 g O .data 00000004 numrows
00000004 g O .bss 00000004 ypos
00000004 g O .data 00000004 numcols
00000004 O *COM* 00000004 screen_mem
00000000 *UND* 00000000 memcpy
00000000 *UND* 00000000 memsetw
00000072 g F .text 0000007d newline
00000000 g O .bss 00000004 xpos
000000ef g F .text 0000002e writech
00000000 g O .data 00000004 colour
0000011d g F .text 00000061 cls
0000017e g F .text 00000010 init_video
0000018e g F .text 00000133 putch
000002c1 g F .text 00000037 puts
000002f8 g F .text 00000001 set_text_colour
And the object file with unresolved reference:
> i586-elf-objdump -ht main.o
main.o: file format elf32-i386
Sections:
Idx Name           Size     VMA      LMA      File off Algn
  0 .text          0000007f 00000000 00000000 00000034 2**2
                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data          00000000 00000000 00000000 000000b4 2**2
                   CONTENTS, ALLOC, LOAD, DATA
  2 .bss           00000000 00000000 00000000 000000b4 2**2
                   ALLOC
  3 .rodata.str1.1 00000024 00000000 00000000 000000b4 2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment       00000012 00000000 00000000 000000d8 2**0
                   CONTENTS, READONLY
SYMBOL TABLE:
00000000 l df *ABS* 00000000 main.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .rodata.str1.1 00000000 .rodata.str1.1
00000000 l d .comment 00000000 .comment
00000000 g F .text 0000007f main
00000000 *UND* 00000000 init_video
00000000 *UND* 00000000 gdt_install
00000000 *UND* 00000000 putch
00000000 *UND* 00000000 puts
00000018 O *COM* 00000001 gdt
00000006 O *COM* 00000001 gdtp
My link script (not sure if it's going to be relevant):
OUTPUT_FORMAT("binary")
ENTRY(start)
phys = 0x00100000;
SECTIONS
{
.text phys : AT(phys) {
code = .;
*(.text)
*(.rodata*)
. = ALIGN(4096);
}
.data . : AT(data)
{
data = .;
*(.data)
. = ALIGN(4096);
}
.bss . : AT(bss)
{
bss = .;
*(.bss)
. = ALIGN(4096);
}
end = .;
}
If I comment out the call to putch in main.c, I instead get undefined references to puts... if I remove the call to gdt_install, no errors!
gdt_install is in the C file, but gdt_install calls a function which is defined in gdt.asm.
void gdt_install() {
/* ... */
gdt_reset();
}
[bits 32]
[section .text]
global gdt_reset
extern gdtp
gdt_reset:
lgdt [gdtp]
mov ax, 0x10 ; 0x10 offset for data segment (sizeof(struct gdt_entry) * 2)
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ss, ax
jmp 0x08:gdt_reset2 ; 0x08 offset for code segment (sizeof(struct gdt_entry))
gdt_reset2:
ret ; ret back to C
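The selector values in the assembly comments (0x08 and 0x10) follow from the packed struct sizes in gdt.h. Here is a minimal sketch that checks that arithmetic, assuming a GCC-compatible compiler and a target where short is 2 bytes and int is 4 bytes. The helper gdt_selector is hypothetical and not part of the original code; the structs are copied from gdt.h with the project's typedefs expanded.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as include/gdt.h, with ushort/uchar/uint written out. */
struct gdt_entry {
    unsigned short limit_low;
    unsigned short base_low;
    unsigned char  base_middle;
    unsigned char  access;
    unsigned char  granularity;
    unsigned char  base_high;
} __attribute__((packed));

struct gdt_ptr {
    unsigned short limit;
    unsigned int   base;
} __attribute__((packed));

/* Compile-time checks: one descriptor is 8 bytes, the lgdt operand is 6. */
static_assert(sizeof(struct gdt_entry) == 8, "a GDT descriptor is 8 bytes");
static_assert(sizeof(struct gdt_ptr) == 6, "16-bit limit plus 32-bit base");

/* Hypothetical helper (an assumption for illustration): byte offset of
 * descriptor `index` in the table. With the privilege and table-indicator
 * bits left at zero, this offset is also the segment selector value. */
static unsigned short gdt_selector(unsigned int index)
{
    return (unsigned short)(index * sizeof(struct gdt_entry));
}
```

Under these assumptions, gdt_selector(1) gives the 0x08 used in the far jump and gdt_selector(2) gives the 0x10 loaded into the data segment registers. The sizes also agree with the objdump output of the linked shared lib, where gdt (three entries) is 0x18 bytes and gdtp is 6 bytes.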
To try and further diagnose the cause, I've been playing around trying to recreate the errors. If I move the gdt_install() function call to a specific place in the source code, I don't receive any errors and everything works fine:
int main() {
init_video();
putch('A');
puts("<- print a single char there...\n");
gdt_install();
puts("asdf\n\n");
int i;
for (i = 0; i < 15; ++i) {
if (i % 2 == 0) {
puts("even\n");
} else {
puts("odd\n");
}
}
return 0;
}
If I move the call above the first puts() call, I receive undefined references for puts!:
...
init_video();
putch('A');
gdt_install();
puts("<- print a single char there...\n");
puts("asdf\n\n");
...
i586-elf-ld -T link.ld -o kernel32.bin kernel_loader.o main.o stdio.o common.o gdt.o gdt_asm.o
main.o: In function `main':
main.c:(.text+0x2b): undefined reference to `puts'
main.c:(.text+0x37): undefined reference to `puts'
main.c:(.text+0x51): undefined reference to `puts'
main.c:(.text+0x63): undefined reference to `puts'
Next, if I move the call above putch(), it causes an undefined reference to putch (which was where I originally had the call):
...
init_video();
gdt_install();
putch('A');
puts("<- print a single char there...\n");
puts("asdf\n\n");
...
main.o: In function `main':
main.c:(.text+0x1f): undefined reference to `putch'
And finally, moving it above init_video() causes an undefined reference to init_video:
...
gdt_install();
init_video();
putch('A');
puts("<- print a single char there...\n");
puts("asdf\n\n");
...
main.o: In function `main':
main.c:(.text+0x15): undefined reference to `init_video'
What on earth is causing this error? It's like the gdt_install call is somehow "corrupting" other symbols... I couldn't find any reference to it in any docs, but is there some way that the gdt_install function call could cause some linker "boundary" to be overrun, corrupting other code?
Has anyone encountered a problem like this, or have any ideas as to further investigation? I've posted on the osdev forum: http://forum.osdev.org/viewtopic.php?f=1&t=22227 but haven't had much luck.
Thanks
Edit:
I'm not sure if it's relevant, but if I omit the link script when linking, all previous errors disappear... (although, then my bootloader cannot call the kernel since it doesn't understand elf binaries).
As requested, here's the main.c file before and after pre-processing and disassembled from the compiled main.o file.
before pre-processing:
#include <stdio.h>
#include <common.h>
#include <gdt.h>
int main() {
init_video();
putch('A');
gdt_install();
puts("<- print a single char there...\n");
puts("asdf\n\n");
int i;
for (i = 0; i < 15; ++i) {
if (i % 2 == 0) {
puts("even\n");
} else {
puts("odd\n");
}
}
return 0;
}
After pre-processing:
i586-elf-gcc -Wall -O -fstrength-reduce -fomit-frame-pointer -fno-inline -nostdinc -nostdlib -fsigned-char -nostartfiles -nodefaultlibs -fno-builtin -fno-stack-protector -I./include -E main.c
# 1 "main.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "main.c"
# 1 "./include/stdio.h" 1
# 1 "./include/common.h" 1
typedef unsigned short ushort;
typedef unsigned char uchar;
typedef unsigned int uint;
typedef unsigned long ulong;
typedef int size_t;
void *memcpy(void *dst, const void *src, size_t n);
void *memset(void *dst, const char val, size_t n);
void *memsetw(void *dst, const ushort val, size_t n);
void *memseti(void *dst, const int val, size_t n);
# 5 "./include/stdio.h" 2
void cls();
void writech(char c);
void putch(char c);
void puts(char *str);
void set_text_colour(uchar f, uchar b);
void init_video();
size_t strlen(char *str);
# 2 "main.c" 2
# 1 "./include/gdt.h" 1
struct gdt_entry {
ushort limit_low;
ushort base_low;
uchar base_middle;
uchar access;
uchar granularity;
uchar base_high;
} __attribute__((packed));
struct gdt_ptr {
ushort limit;
uint base;
} __attribute__((packed));
void gdt_set_gate(int n, ulong base, ulong limit, uchar access, uchar gran);
void gdt_install();
extern void gdt_reset();
# 4 "main.c" 2
int main() {
init_video();
putch('A');
gdt_install();
puts("<- print a single char there...\n");
puts("asdf\n\n");
int i;
for (i = 0; i < 15; ++i) {
if (i % 2 == 0) {
puts("even\n");
} else {
puts("odd\n");
}
}
return 0;
}
Edit, again:
Thanks to nategoose for suggesting -g3 to give nicer disassembly output:
main.o: file format elf32-i386
SYMBOL TABLE:
00000000 l df *ABS* 00000000 main.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .rodata.str1.4 00000000 .rodata.str1.4
00000000 l d .rodata.str1.1 00000000 .rodata.str1.1
00000000 l d .stab 00000000 .stab
00000000 l d .stabstr 00000000 .stabstr
00000000 l d .comment 00000000 .comment
00000000 g F .text 0000007f main
00000000 *UND* 00000000 init_video
00000000 *UND* 00000000 putch
00000000 *UND* 00000000 gdt_install
00000000 *UND* 00000000 puts
Disassembly of section .text:
00000000 <main>:
#include <stdio.h>
#include <common.h>
#include <gdt.h>
int main() {
0: 8d 4c 24 04 lea 0x4(%esp),%ecx
4: 83 e4 f0 and $0xfffffff0,%esp
7: ff 71 fc pushl -0x4(%ecx)
a: 55 push %ebp
b: 89 e5 mov %esp,%ebp
d: 53 push %ebx
e: 51 push %ecx
init_video();
f: e8 fc ff ff ff call 10 <main+0x10>
putch('A');
14: 83 ec 0c sub $0xc,%esp
17: 6a 41 push $0x41
19: e8 fc ff ff ff call 1a <main+0x1a>
gdt_install();
1e: e8 fc ff ff ff call 1f <main+0x1f>
puts("<- print a single char there...\n");
23: c7 04 24 00 00 00 00 movl $0x0,(%esp)
2a: e8 fc ff ff ff call 2b <main+0x2b>
puts("asdf\n\n");
2f: c7 04 24 00 00 00 00 movl $0x0,(%esp)
36: e8 fc ff ff ff call 37 <main+0x37>
3b: 83 c4 10 add $0x10,%esp
int i;
for (i = 0; i < 15; ++i) {
3e: bb 00 00 00 00 mov $0x0,%ebx
if (i % 2 == 0) {
43: f6 c3 01 test $0x1,%bl
46: 75 12 jne 5a <main+0x5a>
puts("even\n");
48: 83 ec 0c sub $0xc,%esp
4b: 68 07 00 00 00 push $0x7
50: e8 fc ff ff ff call 51 <main+0x51>
55: 83 c4 10 add $0x10,%esp
58: eb 10 jmp 6a <main+0x6a>
} else {
puts("odd\n");
5a: 83 ec 0c sub $0xc,%esp
5d: 68 0d 00 00 00 push $0xd
62: e8 fc ff ff ff call 63 <main+0x63>
67: 83 c4 10 add $0x10,%esp
puts("asdf\n\n");
int i;
for (i = 0; i < 15; ++i) {
6a: 43 inc %ebx
6b: 83 fb 0f cmp $0xf,%ebx
6e: 75 d3 jne 43 <main+0x43>
puts("odd\n");
}
}
return 0;
}
70: b8 00 00 00 00 mov $0x0,%eax
75: 8d 65 f8 lea -0x8(%ebp),%esp
78: 59 pop %ecx
79: 5b pop %ebx
7a: 5d pop %ebp
7b: 8d 61 fc lea -0x4(%ecx),%esp
7e: c3 ret
And now the new output from a clean make:
$ make
nasm -f elf kernel_loader.asm -o kernel_loader.o
i586-elf-gcc -Wall -O0 -fstrength-reduce -fomit-frame-pointer -fno-inline -nostdinc -nostdlib -fsigned-char -nostartfiles -nodefaultlibs -fno-builtin -fno-stack-protector -I./include -g3 -c main.c
i586-elf-gcc -Wall -O0 -fstrength-reduce -fomit-frame-pointer -fno-inline -nostdinc -nostdlib -fsigned-char -nostartfiles -nodefaultlibs -fno-builtin -fno-stack-protector -I./include -g3 -c stdio.c
i586-elf-gcc -Wall -O0 -fstrength-reduce -fomit-frame-pointer -fno-inline -nostdinc -nostdlib -fsigned-char -nostartfiles -nodefaultlibs -fno-builtin -fno-stack-protector -I./include -g3 -c common.c
i586-elf-gcc -Wall -O0 -fstrength-reduce -fomit-frame-pointer -fno-inline -nostdinc -nostdlib -fsigned-char -nostartfiles -nodefaultlibs -fno-builtin -fno-stack-protector -I./include -g3 -c gdt.c
nasm -f elf gdt.asm -o gdt_asm.o
i586-elf-ld -T link.ld -o kernel32.bin -\( kernel_loader.o main.o stdio.o common.o gdt.o gdt_asm.o -\)
main.o: In function `main':
/cygdrive/c/programming/os/kernel/main.c:12: undefined reference to `puts'
/cygdrive/c/programming/os/kernel/main.c:14: undefined reference to `puts'
/cygdrive/c/programming/os/kernel/main.c:20: undefined reference to `puts'
/cygdrive/c/programming/os/kernel/main.c:22: undefined reference to `puts'
make: *** [kernel32.bin] Error 1
Edit 3
As requested, here's the output of nm -s on stdio.o
i586-elf-nm -s stdio.o
00000042 T cls
00000000 D colour
00000000 T init_video
U memcpy
U memsetw
0000015e T newline
00000004 D numcols
00000008 D numrows
000001e4 T putch
0000024e T puts
00000004 C screen_mem
000000b8 T scroll
00000291 T set_text_colour
00000016 T strlen
00000199 T writech
00000000 B xpos
00000004 B ypos
Edit 4
As requested, here are the entire source files. I've uploaded the files in a zip to: http://www.owenstephens.co.uk/media/files/kernel.zip Thanks for the continued interest and help, it's much appreciated!
Makefile:
NASM=nasm
GCC=i586-elf-gcc
LD=i586-elf-ld
FMT=-f elf
GFLAGS=-Wall -O0 -fstrength-reduce -fno-inline -nostdinc -nostdlib -fsigned-char -nostartfiles -nodefaultlibs -fno-builtin -fno-stack-protector -I./include -g3 -c
LFLAGS=-T link.ld
ALL=kernel_loader.o main.o stdio.o common.o gdt.o gdt_asm.o
INCLUDES=include/stdio.h include/common.h include/gdt.h
all: $(ALL) kernel32.bin
kernel_loader.o: kernel_loader.asm
$(NASM) $(FMT) $*.asm -o $@
main.o: main.c
$(GCC) $(GFLAGS) $<
stdio.o: stdio.c include/stdio.h
$(GCC) $(GFLAGS) $<
common.o: common.c include/common.h
$(GCC) $(GFLAGS) $<
gdt.o: gdt.c include/gdt.h
$(GCC) $(GFLAGS) $<
gdt_asm.o: gdt.asm
$(NASM) $(FMT) $< -o $@
kernel32.bin: $(ALL) $(INCLUDES)
$(LD) $(LFLAGS) -o $@ -\( $(ALL) -\)
clean:
rm -f $(ALL) kernel32.bin
Link script:
OUTPUT_FORMAT("binary")
ENTRY(_start)
phys = 0x00100000;
SECTIONS
{
.text phys : AT(phys) {
code = .;
*(.text)
*(.rodata*)
. = ALIGN(4096);
}
.data . : AT(data)
{
data = .;
*(.data)
. = ALIGN(4096);
}
.bss . : AT(bss)
{
bss = .;
*(.bss)
. = ALIGN(4096);
}
end = .;
}
main.c:
#include <stdio.h>
#include <common.h>
#include <gdt.h>
int main() {
gdt_install();
puts("This is a minimal example...");
return 0;
}
common.c:
#include <common.h>
void *memcpy(void *dst, const void *src, size_t n) { return (void *)0; }
void *memset(void *dst, const char val, size_t n) { return (void *)0; }
void *memsetw(void *dst, const ushort val, size_t n) { return (void *)0; }
void *memseti(void *dst, const int val, size_t n) { return (void *)0; }
stdio.c:
#include <stdio.h>
#include <common.h>
ushort *screen_mem;
int colour = 0x0F;
int xpos = 0, ypos = 0;
int numcols = 80, numrows = 25;
void init_video() {}
size_t strlen(char *str) { return 0; }
void cls() { }
inline void scroll() { }
inline void newline() { }
void writech(char c) { }
void putch(char c) { }
void puts(char *str) { }
void set_text_colour(unsigned char f, unsigned char b){ }
gdt.c:
#include <gdt.h>
struct gdt_entry gdt[3];
struct gdt_ptr gdtp;
void gdt_set_gate(int n, ulong base, ulong limit, uchar access, uchar gran) { }
void gdt_install() { }
gdt.asm:
global gdt_reset
gdt_reset:
ret
gdt_reset2:
ret
include/common.h:
#ifndef __COMMON_H
#define __COMMON_H
typedef unsigned short ushort;
typedef unsigned char uchar;
typedef unsigned int uint;
typedef unsigned long ulong;
typedef int size_t;
void *memcpy(void *dst, const void *src, size_t n);
void *memset(void *dst, const char val, size_t n);
void *memsetw(void *dst, const ushort val, size_t n);
void *memseti(void *dst, const int val, size_t n);
#endif
include/stdio.h:
#ifndef __STDIO_H
#define __STDIO_H
#include <common.h>
void cls();
void writech(char c);
void putch(char c);
void puts(char *str);
void set_text_colour(uchar f, uchar b);
void init_video();
size_t strlen(char *str);
#endif
include/gdt.h:
#ifndef __GDT_H
#define __GDT_H
#include <common.h>
struct gdt_entry {
ushort limit_low;
ushort base_low;
uchar base_middle;
uchar access;
uchar granularity;
uchar base_high;
} __attribute__((packed));
struct gdt_ptr {
ushort limit;
uint base;
} __attribute__((packed));
void gdt_set_gate(int n, ulong base, ulong limit, uchar access, uchar gran);
void gdt_install();
extern void gdt_reset();
#endif
Output of running "objdump -t" on a shared lib that includes all .o files (except kernel_loader, hence the undefined _start symbol):
> i586-elf-objdump -t libos.so.1.0.1
libos.so.1.0.1: file format elf32-i386
SYMBOL TABLE:
08048080 l d .text 00000000 .text
08048162 l d .rodata 00000000 .rodata
08049180 l d .data 00000000 .data
0804918c l d .bss 00000000 .bss
00000000 l d .stab 00000000 .stab
00000000 l d .stabstr 00000000 .stabstr
00000000 l d .comment 00000000 .comment
00000000 l df *ABS* 00000000 main.c
00000000 l df *ABS* 00000000 stdio.c
00000000 l df *ABS* 00000000 common.c
00000000 l df *ABS* 00000000 gdt.c
00000000 l df *ABS* 00000000 gdt.asm
08048161 l .text 00000000 gdt_reset2
08049180 g O .data 00000004 colour
08048125 g F .text 00000014 memsetw
0804918c g O .bss 00000004 xpos
08049188 g O .data 00000004 numrows
08048158 g F .text 00000005 gdt_install
08048108 g F .text 0000000a memcpy
080480ee g F .text 00000005 puts
08049198 g O .bss 00000018 gdt
08049194 g O .bss 00000004 screen_mem
080480e0 g F .text 0000000e putch
08048144 g F .text 00000014 gdt_set_gate
00000000 *UND* 00000000 _start
08048160 g .text 00000000 gdt_reset
080480b4 g F .text 00000005 init_video
080480c8 g F .text 00000005 scroll
0804918c g *ABS* 00000000 __bss_start
08048112 g F .text 00000013 memset
08048080 g F .text 00000033 main
080480f3 g F .text 00000014 set_text_colour
080480cd g F .text 00000005 newline
08049190 g O .bss 00000004 ypos
080491b0 g O .bss 00000006 gdtp
0804918c g *ABS* 00000000 _edata
080491b8 g *ABS* 00000000 _end
080480c3 g F .text 00000005 cls
080480b9 g F .text 0000000a strlen
08048139 g F .text 0000000a memseti
08049184 g O .data 00000004 numcols
080480d2 g F .text 0000000e writech
This sounds like a circular reference problem on the link line. The linker goes through the object files in order and "remembers" any unresolved externals. However, it also can discard any object files that don't have references to them. If two or more object files reference each other (causing a circular reference), the linker may not be able to track the unresolved entities.
Try duplicating portions of the link line, and then narrow it down to what you need.
i586-elf-ld -T link.ld -o kernel32.bin kernel_loader.o main.o stdio.o common.o gdt.o gdt_asm.o \
stdio.o common.o gdt.o gdt_asm.o
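To make the ordering argument concrete, here is a toy model in C of a strict single left-to-right pass in which an input is pulled in only if it defines a symbol that is currently undefined. As far as I know, GNU ld behaves this way for members of static archives (.a files), while object files named directly on the command line are always included, so this model may not describe the situation in the question. The struct, the function name, and the one-symbol-per-member simplification are all assumptions made for illustration; the symbol names puts and memcpy come from the objdump output of stdio.o in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Toy link input: provides at most one symbol and references at most one. */
struct member {
    const char *defines; /* symbol this member provides, or NULL */
    const char *needs;   /* symbol this member references, or NULL */
};

/* Index of sym in set, or -1 if absent. */
static int find(const char *set[], int n, const char *sym)
{
    for (int i = 0; i < n; ++i)
        if (strcmp(set[i], sym) == 0)
            return i;
    return -1;
}

/* One left-to-right pass over the inputs. root_needs is the symbol that the
 * already-loaded starting object references. Returns how many symbols are
 * still undefined after the pass. */
static int unresolved_after_single_pass(const struct member *m, int count,
                                        const char *root_needs)
{
    const char *defined[MAX_SYMS];
    const char *undefined[MAX_SYMS];
    int ndef = 0, nundef = 0;

    assert(count < MAX_SYMS);
    undefined[nundef++] = root_needs;

    for (int i = 0; i < count; ++i) {
        int slot = m[i].defines ? find(undefined, nundef, m[i].defines) : -1;
        if (slot < 0)
            continue; /* nothing needs this member yet, so it is skipped */

        /* Member is pulled in: its symbol moves from undefined to defined. */
        undefined[slot] = undefined[--nundef];
        defined[ndef++] = m[i].defines;

        /* Its own reference becomes a new undefined symbol if unseen. */
        if (m[i].needs && find(defined, ndef, m[i].needs) < 0 &&
            find(undefined, nundef, m[i].needs) < 0)
            undefined[nundef++] = m[i].needs;
    }
    return nundef;
}
```

In this model, listing the provider of puts (which needs memcpy) before the provider of memcpy leaves nothing unresolved, the reverse order leaves memcpy unresolved, and repeating the memcpy provider at the end of the list fixes it. That last case is the effect that duplicating part of the link line is meant to have.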
The order of files on the command line seems to matter with the GNU linker. Put the .o file containing the entry point (kernel_loader.o) first on the command line, then any objects that it directly references, then the objects referenced by those objects (and that aren't already on the command line), and so on; otherwise it's possible that the linker will miss some files.
There is another SO question (may be similar/identical, I'm not sure) which covers some of this. Does that one offer any help?
I've seen similar problems a few times and at a certain point, just before I go stark raving mad, I start to look for invisible things that might end up in the names: non-ASCII bytes or non-printable ASCII characters that could have snuck into your source code and attached themselves to the first code seen after gdt_install();
You might want to try adding a comment or empty macro (or a do{}while(0)) between your call to gdt_install() and the next real line of code. Maybe even go as far as placing your cursor in the function name, backing it up to just before the first character of that function name, and typing whatever you decide to add there. If it is something caused by the presence of gdt_install(); then something else getting thrown in there should force a different error.
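One way to look for such invisible bytes without relying on an editor is to scan the source text directly. The following is a minimal sketch in C; first_suspicious_byte is a hypothetical helper written for illustration, and its definition of "suspicious" (anything other than printable ASCII, tab, newline, or carriage return) is an assumption that would also flag legitimate non-ASCII text in comments or strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first byte in buf that is
 * neither printable ASCII (0x20..0x7E) nor tab, newline, or carriage return.
 * Returns -1 if every byte passes. */
static long first_suspicious_byte(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; ++i) {
        unsigned char c = buf[i];
        int ok = (c >= 0x20 && c <= 0x7E) ||
                 c == '\t' || c == '\n' || c == '\r';
        if (!ok)
            return (long)i;
    }
    return -1;
}
```

To use it, read main.c into a buffer (for example with fread) and pass the buffer and its length; a non-negative result is the byte offset to inspect in a hex dump.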
If you haven't already, you may want to view the preprocessor output of the file with the call to gdt_install() as well as its assembly output.
If none of that produces anything interesting, change the name of gdt_install and see if that changes anything. I've seen a few instances where bugs in the compiler and/or linker could produce something odd like this. Perhaps a bug in a hash table used as a symbol table (maybe even in the ELF file's hash table).
Hope you figure this out.
A wild guess: perhaps your assembler function, which is called (and probably inlined) in gdt_install, is messing up what comes after it. (The jump at the end before the ret is unusual; I've never seen that.)
Is your gdt_install in the same compilation unit as the call? Do you use -O[12] for compilation? Is the call inlined? What does the assembly that the compiler produces for the call site look like?
Did you try to compile with -O0 -g or -fno-inline (or whatever this option is called)?
You wrote: "I've implemented my own pseudo-stdio functions (puts, putch...) the functions match exactly ..." Are you using anything in the standard libc? If you aren't, you should add -nostdlib to your command line. I have seen strange things happen when I've tried to override functions in the standard libc.
I suspect that perhaps the combining of the linking symbol tables from the nasm generated object and the gcc/gas generated objects might be messing something up.
Could you try replacing the call to gdt_install with a call to a short inline function containing inline assembly that calls or jumps to gdt_install and resides in the same file as the current call to gdt_install?
Another thing that just popped into my mind is that if gdt_install is written in assembly language then it is possible that it might not be 100% syntactically correct. I've never seen anything like this, but it might be possible that gdt_install's boundaries (particularly the end) or its size are not correctly determined by the assembler, and that this is producing random-ish results.
Barring that, I think you're going to have to go to the binutils folks and ask them for help directly.