
Compile code as a single, automatically merged file to allow better compiler optimization

Suppose you have a program in C, C++ or any other language that employs the "compile objects, then link them" scheme.

When your program is not small, it is likely to comprise several files, in order to ease code management (and shorten compilation time). Furthermore, after a certain degree of abstraction you likely have a deep call hierarchy. Especially at the lowest levels, where tasks are the most repetitive and most frequent, you want to impose a general framework.

However, if you fragment your code into different object files and use a very abstract architecture for your code, it might hurt performance (which is bad if you or your supervisor emphasizes performance).

One way to circumvent this might be extensive inlining - this is the approach of template meta-programming: in each translation unit you include all the code of your general, flexible structures, and count on the compiler to counteract performance issues. I want to do something similar without templates - say, because templates are too hard to handle or because you use plain C.
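
To make that concrete, here is a minimal sketch of that header-only style (the names are invented for illustration): the entire implementation sits in the header, so every translation unit that includes it gives the compiler full visibility for inlining.

    // fixed_stack.h - hypothetical header-only structure; the whole
    // implementation is visible in every translation unit that includes it.
    #pragma once
    #include <cstddef>

    template <typename T, std::size_t N>
    class FixedStack {
    public:
        void push(const T& v)    { data_[size_++] = v; }  // trivially inlinable
        T    pop()               { return data_[--size_]; }
        std::size_t size() const { return size_; }
    private:
        T data_[N];
        std::size_t size_ = 0;
    };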

You could write all your code into one single file. That would be horrible. What about writing a script that merges all your code into one source file and compiles it, provided your source files are not written too wildly? Then a compiler could probably apply much more optimization (inlining, dead code elimination, compile-time arithmetic, etc.).

Do you have any experience with, or objections against, this "trick"?


Pointless with a modern compiler. MSVC, GCC and Clang all support link-time code generation (GCC and Clang call it 'link-time optimisation'), which allows for exactly this. Plus, combining multiple translation units into one large one prevents you from parallelising the compilation process, and (at least in the case of C++) makes RAM usage go through the roof.
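
For reference, this is roughly how it is switched on (file names here are placeholders; exact flags may vary with toolchain version):

    g++ -O2 -flto -c a.cpp
    g++ -O2 -flto -c b.cpp
    g++ -O2 -flto a.o b.o -o prog    # cross-TU optimization happens at link time
    # MSVC equivalent: compile with /GL, link with /LTCG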

in each translation unit you include all the code of your general, flexible structures, and count on the compiler to counteract performance issues.

This is not a feature, and it's not related to performance in any way. It's an annoying limitation of compilers and the include system.


This is a semi-valid technique; IIRC, KDE used to use it to speed up compilation back in the day when most people had one CPU core. There are caveats, though: if you decide to do something like this, you need to write your code with it in mind.

Some examples of things to watch out for (a small sketch of the resulting clash follows the list):

  • Anonymous namespaces - namespace { int x; } in two source files.
  • Using-directives that affect the code following them - using namespace foo; in a .cpp file can be OK on its own, but the appended sources may not agree.
  • The C counterpart of anonymous namespaces, static globals - static int i; at file scope in several .cpp files will cause problems.
  • #defines in .cpp files - they will affect source files that don't expect them.
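
A tiny sketch of the kind of clash this produces (file names are made up). Each file is fine on its own, but once the sources are concatenated both definitions land in the same translation unit:

    // a.cpp
    namespace { int x = 0; }   // anonymous namespace, private to a.cpp
    static int counter = 0;    // file-scope static, also private to a.cpp
    #define MAX_LEN 64

    // b.cpp
    namespace { int x = 0; }   // fine as a separate TU; after merging, 'x' is a
                               // redefinition in the shared anonymous namespace
    static int counter = 0;    // same problem for the file-scope static
    // MAX_LEN from a.cpp is now also visible here, whether b.cpp expects it or not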

Modern compilers/linkers are fully able to optimize across translation units (link-time code generation) - I don't think you'll see any noticeable difference using this approach.


It would be better to profile your code for bottlenecks, and apply inlining and other speed hacks only where appropriate. Optimization should be performed with a scalpel, not with a shotgun.


Though it is not generally recommended, using #include directives for C files is essentially the same as appending the entire contents of the included file to the current one.

This way, if you include all of your files in one "master file", that file will essentially be compiled as if all the source code had been appended to it.
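
A minimal sketch of such a master file, with made-up file names; each #include pastes the named source file into this single translation unit, so the compiler sees the whole program at once:

    // unity.cpp - hypothetical "master file" for a merged build
    #include "module_a.cpp"
    #include "module_b.cpp"
    #include "main.cpp"

You then build only this one file (for example, g++ -O2 unity.cpp -o prog); the same idea works for .c files with a C compiler.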


SQLite does this with its amalgamation source file; have a look at: http://www.sqlite.org/amalgamation.html


Do you mind if I share some experience about what makes software slow, especially when the call tree gets bushy? The cost to enter and exit functions is almost totally insignificant except for functions that

  • do very little computation and (especially) do not call any further functions,

  • and are actually in use for a significant fraction of the time (i.e. random-time samples of the program counter are actually in the function for 10% or more of the time).

So inlining helps performance only for a certain kind of function.
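
As a made-up illustration: a tiny leaf function like squared below is the kind that gains from inlining when it sits in a hot loop, whereas sum_of_squares already does enough work per call that the cost of calling it is noise.

    #include <cstddef>

    // Tiny leaf function: trivial body, calls nothing further.
    // If it shows up in hot loops, inlining it pays off.
    inline double squared(double x) { return x * x; }

    // Here the loop body dwarfs the cost of the call itself,
    // so inlining sum_of_squares would buy almost nothing.
    double sum_of_squares(const double* v, std::size_t n) {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            s += squared(v[i]);
        return s;
    }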

However, your supervisor could be right that software with layers of abstraction has performance problems.

  • It's not because of the cycles spent entering and leaving functions.

  • It's because of the temptation to write function calls without real awareness of how long they take.

A function is a bit like a credit card. It begs to be used. So it's no mystery that with a credit card you spend more than you would without it. However, it's worse with functions, because functions call functions call functions, over many layers, and the overspending compounds exponentially.

If you get experience with performance tuning like this, you come to recognize the design approaches that result in performance problems. The ones I see over and over are too many layers of abstraction, excess notification, overdesigned data structures, stuff like that.
