Global variable implementation
When I write the following program:
file 1:
#include <stdio.h>
int global;
void print_global1() {
printf("%p\n", &global);
}
file 2:
#include <stdio.h>
char global;
void print_global2() {
printf("%p\n", &global);
}
file 3:
void print_global1();
void print_global2();
int main()
{
print_global1();
print_global2();
return 0;
}
output:
$ ./a.out
0x804a01c
0x804a01c
Here are my questions:
- Why does the linker implement "int global" and "char global" as the same global variable? How come the compiler does not complain (not the smallest warning with -Wall -Wextra -ansi ...)?
- How is the size of the global variable managed (the sizes of int and char are different)?
PS: The second question is architecture/compiler related, so let's take gcc or Visual C++ (for C) with a 32-bit int.
EDIT: THIS IS NOT A QUESTION FOR C++ BUT for C!
I use gcc version 4.4.1 on Ubuntu 9.10. Here is the compilation console output:
$ ls
global_data1.c global_data2.c global_data.c
$ gcc -Wall -Wextra -ansi global_data*.c
$ ./a.out
0x804a01c
0x804a01c
or
$ gcc -Wall -Wextra -ansi -c global_data*.c
$ gcc -Wall -Wextra -ansi global_data*.o
$ ./a.out
0x804a01c
0x804a01c
gcc does not report any errors or warnings, but g++ does.
EDIT:
Looks like C allows tentative definitions for a variable.
In your case both global definitions are tentative, so the linker merges them into a single variable instead of reporting a conflict.
Change your file2 to:
char global = 1; // no more tentative...but explicit.
Now if you compile like before, the tentative def in file1 will be ignored.
Make both the def explicit by:
int global = 1; // in file1
char global = 1; // in file2
now neither can be ignored and we get the multiple def error.
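To see the difference between tentative and explicit definitions without involving the linker, here is a minimal single-file sketch. The names counter, unset and the two read_* helpers are made up for illustration and are not taken from the question:

```c
int counter;        /* tentative definition: file scope, no initializer */
int counter;        /* a second tentative definition of the same type is allowed */
int counter = 5;    /* explicit definition: the tentative ones now act as plain declarations */

int unset;          /* only ever tentatively defined in this translation unit */

/* Hypothetical helpers so a caller can observe the resulting values. */
int read_counter(void)
{
    return counter;
}

int read_unset(void)
{
    return unset;
}
```

Called from main, read_counter() should give 5 and read_unset() should give 0, since a variable that has only tentative definitions behaves as if it had been defined with an initializer of 0.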
This has to do with something called a "tentative definition" in C. First, if you initialize global in both file1 and file2, you will get an error in C. This is because global is no longer tentatively defined in file1 and file2; it is actually defined.
From the C standard (emphasis mine):
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
In your case, a "translation unit" is (basically) each source file.
About "composite types":
For an identifier with internal or external linkage declared in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the type of the identifier at the later declaration becomes the composite type.
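A small sketch of what "composite type" means in practice, assuming a made-up array called table: the first declaration leaves the size open, the second one supplies it, and by the end of the translation unit the object has the completed type.

```c
#include <stddef.h>

int table[];    /* tentative definition with an incomplete type (size unknown) */
int table[4];   /* second tentative definition: the composite type is int[4] */

/* Hypothetical helper: number of elements according to the composite type. */
size_t table_len(void)
{
    return sizeof table / sizeof table[0];
}
```

Here table_len() should return 4, because both declarations refer to one object whose type is the combination of the two.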
For more on tentative definitions, see this question and its answers.
It seems that in your case this should be undefined behavior, because global is defined at the end of each translation unit, so you get two definitions of global, and what's worse, they have different types. It looks like the linker doesn't complain about this by default, though.
GNU ld has an option called --warn-common, which warns about multiple tentative definitions ("common symbol" is the linker's name for a tentatively defined variable):
$ gcc -Wl,--warn-common file*.c
/tmp/ccjuPGcq.o: warning: common of `global' overridden by larger common
/tmp/ccw6nFHi.o: warning: larger common is here
From the manual:
If there are only (one or more) common symbols for a variable, it goes in the uninitialized data area of the output file. The linker merges multiple common symbols for the same variable into a single symbol. If they are of different sizes, it picks the largest size. The linker turns a common symbol into a declaration, if there is a definition of the same variable.
The --warn-common option can produce five kinds of warnings. Each warning consists of a pair of lines: the first describes the symbol just encountered, and the second describes the previous symbol encountered with the same name. One or both of the two symbols will be a common symbol.
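As a rough single-file illustration of what the merged symbol means for the program in the question: the storage has the size of the larger declaration, and the code that believes it is a char simply reads and writes the first byte at the same address. This only simulates the effect in one translation unit; merged_global and the helper functions are hypothetical names.

```c
#include <stddef.h>

int merged_global;  /* stands in for the single symbol the linker keeps (the larger, int-sized one) */

/* Bytes reserved for the merged symbol. */
size_t bytes_reserved(void)
{
    return sizeof merged_global;
}

/* What file 2 effectively sees: a char located at the very same address. */
char *char_view(void)
{
    return (char *)&merged_global;
}

/* Store through the char view, then read the int back.
   Only one byte changes; which byte depends on the machine's endianness. */
int store_as_char(char value)
{
    merged_global = 0;
    *char_view() = value;
    return merged_global;
}
```

On a little-endian machine store_as_char(1) should return 1; on a big-endian machine with a 32-bit int it would be 0x01000000. Either way, file 1 and file 2 would be stepping on each other's data.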
The linker allows for having duplicate external data like this (although I'm surprised that the different types don't cause a problem). Which one you get depends upon the order of your object files on your link command line.
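A common way to avoid depending on this linker behaviour is to keep exactly one definition and give every other file an extern declaration, usually through a shared header. The sketch below squeezes that layout into one file for brevity; the header name global_data.h and the two helper functions are assumptions for illustration. With this layout, a stray "char global;" in a file that includes the header is rejected by the compiler as a conflicting type, and compiling with GCC's -fno-common option turns duplicate tentative definitions into a multiple-definition link error.

```c
/* --- would live in a shared header, e.g. a hypothetical global_data.h --- */
extern int global;          /* declaration only: reserves no storage */

/* --- would live in exactly one .c file --- */
int global = 0;             /* the single external definition */

/* --- every other .c file includes the header and just uses the name --- */
int *address_seen_by_file1(void)
{
    return &global;
}

int *address_seen_by_file2(void)
{
    return &global;
}
```

Both helpers return the same address, as in the question's output, but now because every file agrees on one declaration rather than because the linker silently merged two different ones.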
Which compiler are you using? What is the platform? With g++ I get:
/tmp/cc8Gnf4h.o:(.bss+0x0): multiple definition of `global'
/tmp/ccDQHZn2.o:(.bss+0x0): first defined here
/usr/bin/ld: Warning: size of symbol `global' changed from 4 in a.o to 1 in b.o
AFAIR, in C++ variables in different translation units must have exactly the same declaration to work.