Does gcc link order affect the execution speed of the program?
I know that link order in gcc matters for symbols to be resolved correctly, but now I am seeing a weird speed issue in the resulting executable. I am linking objects and archives as
g++ -m32 a.o b.o ar1.a ar2.a -lm -lpthread -lcrypt -lz -pthread -o afast.out
vs
g++ -m32 a.o ar1.a b.o ar2.a -lm -lpthread -lcrypt -lz -pthread -o aslow.out
The second version runs 2x slower. b.o is actually in the ar1.a archive, but ar2.a has references to it and the linker complains about them, so I had to list b.o explicitly. At first I put b.o at the very end of the link line to get the dependency order right, but then I found that it also works at the beginning, and the result is even faster.
Has anyone experienced this? Is the link order of object files treated differently from the order of archives? How can there be any speed impact?
I get similar results with gcc 3.4.6 and gcc 4.1.2.
There could be significant differences in execution speed depending on how the object code is laid out in memory. In general, you want hot functions to be close together, so they are not mixed up with cold functions, and so your Icache and TLB are not polluted by cold functions. It is, however, very unlikely that you are affected by this.
Most likely, you have some symbols that are resolved one way in the "fast" executable, and another way in the "slow" executable. The order of archive libraries and object files on the command line matters, and you can end up pulling some object from ar1.a in the "fast" link, whereas you'll pull an equivalent object from ar2.a in the "slow" link. Perhaps there is some un-optimized code in ar2.a?
Running nm -A ar1.a ar2.a and checking whether any symbols are defined in both would be the first step. You can then ask the linker to produce a link map (with -Wl,-Map,map.out) and check where these symbols actually come from in the two links.
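The nm check can be scripted. The following is only a rough sketch: dup_symbols is a hypothetical helper name (not part of binutils), and it assumes GNU nm's -A output format for archives, archive:member:address type name. It reads that output on standard input and lists every global symbol that is defined in more than one archive.

```shell
# dup_symbols: hypothetical helper, not a binutils tool.
# Reads `nm -A` output on stdin and prints each global symbol that is
# defined in more than one archive, followed by the archives defining it.
dup_symbols() {
    awk '
        # Keep only global definitions: text (T), data (D), BSS (B),
        # read-only data (R) and weak symbols (W, V). Undefined (U)
        # and local (lower-case) symbols are skipped.
        NF == 3 && $2 ~ /^[TDBRWV]$/ {
            split($1, loc, ":")            # loc[1] is the archive name
            key = $3 SUBSEP loc[1]
            if (!(key in seen)) {          # count each archive once per symbol
                seen[key] = 1
                count[$3]++
                where[$3] = where[$3] " " loc[1]
            }
        }
        END {
            for (sym in count)
                if (count[sym] > 1)
                    print sym ":" where[sym]
        }
    ' | sort
}

# Typical use, with the archive names from the question:
#   nm -A ar1.a ar2.a | dup_symbols
```

Each symbol it reports is a candidate to look up in the link maps of the fast and the slow build, to see which archive member actually supplied it in each case.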