Constant combining in optimizing compilers
I have a header file containing a lot of small inline functions. Most of them happen to have constant data. Since these functions are performance critical, the way they handle constants becomes important. AFAIK there are two ways to refer to constants:
1) Define them in a separate source file that is later linked with the application.
2) Define the constants in-place.
I would choose the latter because it is more maintainable. However, it might be slower if the compiler does not merge the thousands of equal constants that inlining creates.
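To make the two options concrete, here is a minimal sketch. The names kScaleExtern, kScaleLocal, scale_extern and scale_local are hypothetical and do not come from the real header; the out-of-line definition appears in the same block only to keep the sketch self-contained.

```cpp
// Sketch only: all names here are hypothetical.

// Way 1: the header only declares the constant ...
extern const float kScaleExtern[4];

inline float scale_extern(float x, int lane) {
    return x * kScaleExtern[lane];
}

// ... and exactly one source file (say, constants.cc) would define it.
const float kScaleExtern[4] = {1.0f, 2.0f, 4.0f, 8.0f};

// Way 2: the constant is defined in place, inside the inline function.
inline float scale_local(float x, int lane) {
    static const float kScaleLocal[4] = {1.0f, 2.0f, 4.0f, 8.0f};
    return x * kScaleLocal[lane];
}
```

With way 1 the compiler cannot see the values while compiling other translation units, so it has to load them from memory; with way 2 it sees them everywhere the function is inlined.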
The question:
Will the compiler combine these equal constants? In particular, which of the following methods will be utilized?
1) Combining of equal constants across the compilation unit.
2) Combining of equal constants across the linking module (whole program or library).
3) Combining the constants with any static constant data that happens to have the same bit pattern and fulfills the alignment requirements, across the compilation unit or the whole program.
I use a modern compiler (GCC 4.5).
I'm not an expert in assembler, thus I couldn't answer this question myself using several simple tests :)
EDIT:
The constants are quite big (most of them at least 16 bytes), so the compiler can't make them immediate values.
EDIT2:
EXAMPLE of the code
This one uses the constant in-place:
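float_4 sign(float_4 a)
{
    // Integer storage keeps the 0x80000000 bit pattern; a float array would convert it to 2147483648.0f.
    const __attribute__((aligned(16))) unsigned int mask_bits[4] = { //I use a macro for this line
        0x80000000, 0x80000000, 0x80000000, 0x80000000};
    const int128 mask = load(mask_bits); // renamed the array so it no longer clashes with 'mask'
    return b_and(a, mask);
}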
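For readers who do not have the float_4, int128, load and b_and helpers, the intent of the masking can be sketched for a single lane in portable C++. This is only an illustration under the assumption that float is a 32-bit IEEE-754 value; sign_mask_bits and sign_bit_only are hypothetical names, not part of the original header.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar stand-in: isolate the sign bit of one float.
// Assumes float is a 32-bit IEEE-754 binary32 value.
inline std::uint32_t sign_mask_bits(float a) {
    const std::uint32_t mask = 0x80000000u;  // sign bit only
    std::uint32_t bits;
    std::memcpy(&bits, &a, sizeof bits);     // copy the float's bytes into an integer
    return bits & mask;
}

// Same operation, but the masked bits are turned back into a float
// (the result is either +0.0f or -0.0f).
inline float sign_bit_only(float a) {
    const std::uint32_t bits = sign_mask_bits(a);
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

The vector version in the question does the same thing to four lanes at once, which is why its mask has to live in memory as a 16-byte constant.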
According to the GCC documentation, the following option does what you want:
-fmerge-constants
Attempt to merge identical constants (string constants and floating point constants) across compilation units. This option is the default for optimized compilation if the assembler and linker support it. Use -fno-merge-constants to inhibit this behavior.
Enabled at levels -O, -O2, -O3, -Os.
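One way to see the option at work is to compile a small file and look at the read-only data in the assembly output. The sketch below is an assumption-laden illustration: the file name merge.cc and the function names are made up, and the exact section names depend on the target (GCC on an ELF platform is assumed).

```cpp
// Hypothetical file merge.cc; the function names are made up for illustration.
// Both inline functions use the same floating point constant.
inline double scale_a(double x) { return x * 1.0009765625; }
inline double scale_b(double x) { return x + 1.0009765625; }

// After inlining, this function refers to the constant twice.
double use_both(double x) { return scale_a(x) + scale_b(x); }

// One possible way to inspect the result:
//   g++ -O2 -S -o - merge.cc | grep -B1 -A3 rodata
// On x86-64 ELF targets GCC typically places such constants in a
// mergeable section (for example .rodata.cst8), which is what lets the
// linker combine identical entries coming from different object files.
```

Note that the quoted text mentions string and floating point constants only. GCC also documents -fmerge-all-constants, which additionally considers constant-initialized arrays and variables, but its documentation warns that this can make the compiler non-conforming because distinct variables may then share an address.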
If you define constants in your header file like this:
int const TEN = 10;
// or
enum { ELEVEN = 11 };
That is, if the definition of the constant, and not only its declaration, is visible to the compiler when it compiles a translation unit (a .cc source file), then the compiler replaces the constant with its value in the generated code, even with no optimizations enabled.
[max@truth test]$ cat test.cc
int const TEN = 10; // definition available
extern int const TWELVE; // only declaration
int foo(int x) { return x + TEN; }
int bar(int x) { return x + TWELVE; }
[max@truth test]$ g++ -S -o - test.cc | c++filt | egrep -v " *\."
foo(int):
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
addl $10, %eax
leave
ret
bar(int):
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
movl TWELVE(%rip), %eax
addl -4(%rbp), %eax
leave
ret
TEN:
Notice how in foo(int) it does the addition as addl $10, %eax, i.e. the TEN constant is replaced with its value. In bar(int), on the other hand, it first does movl TWELVE(%rip), %eax to load the value of TWELVE from memory into the eax register (the address will be resolved by the linker) and then does the addition addl -4(%rbp), %eax.
An optimized version looks like this:
[max@truth test]$ g++ -O3 -S -o - test.cc | c++filt | egrep -v " *\."
foo(int):
leal 10(%rdi), %eax
ret
bar(int):
movl TWELVE(%rip), %eax
addl %edi, %eax
ret
I don't think that there are general answers to your questions. I give one for C only; the rules for C++ are different.
This depends a lot on the types of your constants. An important class is "integer constant expressions". These can be determined at compile time and, in particular, can be used as the values of enumeration constants. Use them whenever you can:
enum { myFavoriteDimension = 55/2 };
For such constants the best thing usually happens: they are realized as assembler immediates. They do not even have a storage location; they are written directly into the assembly, so your questions do not apply to them.
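As a small illustration of what "integer constant expression" buys you, the sketch below uses the enumeration constant in places that require a compile-time value. It is written so that it is also valid C++; grid, grid_length and classify are hypothetical names introduced only for this example.

```cpp
enum { myFavoriteDimension = 55 / 2 };  // integer division, so the value is 27

// An enumeration constant can be used as an array bound ...
static int grid[myFavoriteDimension];

inline int grid_length() {
    return static_cast<int>(sizeof grid / sizeof grid[0]);
}

// ... and as a case label, both of which need a compile-time value.
inline int classify(int n) {
    switch (n) {
    case myFavoriteDimension:
        return 1;
    default:
        return 0;
    }
}
```

Because the constant has no address, there is nothing for the compiler or linker to merge.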
For other data types the question is more delicate. Try to ensure that the address of your "const qualified variables" is never taken. This can be done with the register keyword.
register double const something = 5.7;
may have the same effect as above.
For composed types (struct, union, arrays) there is no general answer or method. I have already seen that gcc is able to optimize small arrays (10 elements or so) completely.
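A quick experiment along these lines might look like the following sketch. The function name weighted_sum and the table contents are hypothetical; whether the array really disappears depends on the compiler version and optimization level, so check the assembly rather than taking it for granted.

```cpp
// Hypothetical example of a small constant array used only with
// compile-time-known indices.
inline int weighted_sum(int x) {
    static const int weights[4] = {1, 2, 3, 4};
    int total = 0;
    for (int i = 0; i < 4; ++i) {
        total += weights[i] * x;  // every index is known after unrolling
    }
    return total;  // algebraically 10 * x
}

// To check what the compiler made of it (file name is an assumption):
//   g++ -O2 -S -o - small_array.cc
// If the array was optimized away, no 'weights' data shows up in the output.
```

If the array is indexed with a value known only at run time, the compiler has to keep it in memory, and the merging questions from the original post apply again.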