
Why is including a header file such an evil thing?

I have seen many explanations on when to use forward declarations over including header files, but few of them go into why it is important to do so. Some of the reasons I have seen include the following:

  • compilation speed
  • reducing complexity of header file management
  • removing cyclic dependencies

Coming from a .NET background I find header management frustrating. I have this feeling I need to master forward declarations, but I have been scraping by on includes so far.

Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?

How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?

I can buy the argument for reduced complexity, but what would a practical example of this be?


"to master forward declarations" is not a requirement, it's a useful guideline where possible.

When a header is included, and it pulls in more headers, and those pull in yet more, the compiler has to do a lot of work to process a single translation unit.

You can see how much, for example, with gcc -E:

A single #include <iostream> gives my g++ 4.5.2 an additional 18,560 lines of code to process.

A #include <boost/asio.hpp> adds another 74,906 lines.

A #include <boost/spirit/include/qi.hpp> adds 154,024 lines, that's over 5 MB of code.

This adds up, especially if carelessly included in some file that's included in every file of your project.

Sometimes going over old code and pruning unnecessary includes improves compilation time dramatically just because of that. Replacing includes with forward declarations in the translation units where only references or pointers to some class are used improves this even further.


Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?

It cannot because, unlike some other languages, C++ has an ambiguous grammar:

int f(X);

Is it a function declaration or a variable definition? To answer this question the compiler must know what X means, so X must be declared before that line.
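To make the two readings concrete, here is a minimal sketch. The namespace names are made up for illustration; the same line `int f(X);` appears in both, and only the earlier declaration of X decides what it means.

```cpp
#include <type_traits>

namespace x_is_type {
    struct X {};
    int f(X);   // X names a type: this declares a function taking an X
    static_assert(std::is_function<decltype(f)>::value,
                  "f is a function here");
}

namespace x_is_value {
    int X = 42;
    int f(X);   // X names a variable: this defines an int f initialized from X
    static_assert(std::is_same<decltype(f), int>::value,
                  "f is an int variable here");
}
```

Both static_asserts are checked at compile time, so the file compiling at all shows that the compiler resolved the same tokens in two different ways.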


Because when you're doing something like this :

bar.h :

class Foo;  // forward declaration; the full definition is not needed here

class Bar {
  int foo(Foo &);
};

Then the compiler does not need to know how the Foo struct/class is defined, so including the header that defines Foo is unnecessary; a forward declaration is enough. Moreover, including the header that defines Foo might also mean including the header that defines some other class that Foo uses, and that might mean including the header that defines yet another class, and so on... turtles all the way down.

In the end, the file that the compiler is working on is almost the result of copy-pasting all the headers together, so it gets big for no good reason. And when someone makes a typo in a header file that you don't actually need, compiling your class starts to take way too much time (or fails for no obvious reason).

So it's a good thing to give as little info as needed to the compiler.
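As a rough single-file sketch of the same idea: the three sections below stand in for bar.h, foo.h and bar.cpp. The contents of Foo, the body of Bar::foo and the self-check function are invented for illustration. Only the last section needs Foo's full definition.

```cpp
#include <cassert>

// --- what bar.h would contain ---
class Foo;                  // forward declaration: introduces the name only

class Bar {
public:
    int foo(Foo& f);        // a reference parameter does not need Foo's definition
};

// --- what foo.h would contain (hypothetical contents) ---
class Foo {
public:
    int value = 7;
};

// --- what bar.cpp would contain: it uses Foo's members, so it includes foo.h ---
int Bar::foo(Foo& f) {
    return f.value + 1;
}

// Small self-check that a caller could run from main().
void bar_self_check() {
    Foo f;
    Bar b;
    assert(b.foo(f) == 8);
}
```

Any file that includes only bar.h and just passes Foo references around never has to see foo.h or anything foo.h includes.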


How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?

1) reduced disk i/o (fewer files to open, fewer times)

2) reduced memory/cpu usage: most translations need only a name. if you use/allocate the object, you'll need its full definition, not just a declaration.

this is probably where it will click for you: each file you compile compiles what is visible in its translation.

a poorly maintained system will end up including a ton of stuff it does not need - then this gets compiled for every file it sees. by using forwards where possible, you can bypass that, and significantly reduce the number of times a public interface (and all of its included dependencies) must be compiled.

that is to say: the content of the header won't be compiled once. it will be compiled over and over. everything in this translation must be parsed, checked that it's a valid program, checked for warnings, optimized, etc. many, many times.

lazily including everything only increases disk/cpu/memory usage, which turns into intolerable build times for you, while introducing significant dependencies (in non-trivial projects).

I can buy the argument for reduced complexity, but what would a practical example of this be?

unnecessary includes introduce dependencies as side effects. when you edit an include (necessary or not), then every file which includes it must be recompiled (not trivial when hundreds of thousands of files must be unnecessarily opened and compiled).

Lakos wrote a good book which covers this in detail:

http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1?ie=UTF8&s=books&qid=1304529571&sr=8-1


Header file inclusion rules specified in this article will help reduce the effort in managing header files.


I used forward declarations simply to reduce the amount of navigation between source files. For example, if module X calls some glue or interface function F in module Y, then with a forward declaration, writing the function and the call means visiting only two places, X.c and Y.c. This is not so much of an issue when a good IDE helps you navigate, but I tend to prefer coding bottom-up (creating working code, then figuring out how to wrap it) rather than through top-down interface specification. As the interfaces themselves evolve, it's handy not to have to write them out in full.

In C (or C++ minus classes) it's possible to truly keep structure details private by defining them only in the source files that use them and exposing only forward declarations to the outside world, a level of black-boxing that requires performance-destroying virtuals in the C++/classes way of doing things. It's also possible to avoid prototyping things (visiting the header) by listing functions 'bottom-up' within the source files (good old static keyword).
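A minimal sketch of that black-boxing, written as C-style C++ in a single file. The Counter type and the counter_* functions are hypothetical names chosen for the example. The first section is what a header would expose, and the second is what would live in exactly one source file.

```cpp
#include <cassert>

// --- public interface (what the header would expose) ---
struct Counter;                           // incomplete type: layout stays hidden
Counter* counter_create();
void     counter_increment(Counter* c);
int      counter_value(const Counter* c);
void     counter_destroy(Counter* c);

// --- private implementation (what the single source file would contain) ---
struct Counter {
    int count;
};

Counter* counter_create()                 { return new Counter{0}; }
void     counter_increment(Counter* c)    { ++c->count; }
int      counter_value(const Counter* c)  { return c->count; }
void     counter_destroy(Counter* c)      { delete c; }

// Client-style usage: only the functions are touched, never the fields.
void counter_self_check() {
    Counter* c = counter_create();
    counter_increment(c);
    counter_increment(c);
    assert(counter_value(c) == 2);
    counter_destroy(c);
}
```

Because clients only ever see `struct Counter;`, the fields can change without forcing any client file to be recompiled.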

The pain of managing headers can sometimes expose how modular your program is or isn't. If it's truly modular, the number of headers you have to visit and the amount of code and data structures declared within them should be minimal.

Working on a big project with 'everything included everywhere' through precompiled headers won't encourage this real modularity.

Module dependencies can correlate with data flow relating to performance issues, i.e. both i-cache and d-cache issues. If a program involves many modules that call each other and modify data at many random places, it's likely to have poor cache coherency. The process of optimizing such a program will often involve breaking up passes and adding intermediate data, often playing havoc with many 'class diagrams'/'frameworks' (or at least requiring the creation of many intermediate data structures). Heavy template use often means complex, pointer-chasing, cache-destroying data structures. In its optimized state, dependencies and pointer chasing will be reduced.


I believe forward declarations speed up compilation because the header file is ONLY included where it is actually used. This reduces how often the file has to be opened and closed. You are correct that at some point the object referenced will need to be compiled, but if I am only using a pointer to that object in my other .h file, why actually include it? If I tell the compiler I am using a pointer to a class, that's all it needs (as long as I am not calling any methods on that class).

This is not the end of it. Those .h files include other .h files... So, for a large project, opening, reading, and closing all the .h files that are included repeatedly can become a significant overhead. Even with #ifndef checks (include guards), you still have to open and close them a lot.
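Assuming the checks mentioned above are ordinary include guards, here is a small sketch of what the preprocessor sees when a hypothetical widget.h is pulled in twice. The guard makes the second copy expand to nothing, but the text of the header still has to be reached and scanned before it can be skipped.

```cpp
#include <cassert>

// First "inclusion" of widget.h: WIDGET_H is not defined yet, so the body is kept.
#ifndef WIDGET_H
#define WIDGET_H
struct Widget { int id; };
#endif

// Second "inclusion": WIDGET_H is already defined, so everything up to #endif is skipped.
#ifndef WIDGET_H
#define WIDGET_H
struct Widget { int id; };
#endif

// Without the guards, the second struct Widget would be a redefinition error.
void widget_self_check() {
    Widget w{5};
    assert(w.id == 5);
}
```

A forward declaration avoids even this skipped pass, since the header is never named in the first place.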

We practice this at my place of employment. My boss explained this in a similar way, but I'm sure his explanation was clearer.


How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?

Because include is a preprocessor thing, which means it is done via brute force when parsing the file. Your object will be compiled once (compiler) then linked (linker) as appropriate later.

In C/C++, when you compile, you've got to remember there is a whole chain of tools involved (preprocessor, compiler, linker plus build management tools like make or Visual Studio, etc...)


Good and evil. The battle continues, but now on the battle field of header files. Header files are a necessity and a feature of the language, but they can create a lot of unnecessary overhead if used in a non optimal way, e.g. not using forward declarations etc.

How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?

I can buy the argument for reduced complexity, but what would a practical example of this be?

Forward declarations are bad ass. My experience is that a lot of C++ programmers are not aware of the fact that you don't have to include any header file unless you actually want to use some type, i.e. you need to have the type defined so the compiler understands what you want to do. It's important to try to refrain from including header files in other header files.

Just passing around a pointer from one function to another, only requires a forward declaration:

// someFile.h
class CSomeClass;
void SomeFunctionUsingSomeClass(CSomeClass* foo);

Including someFile.h does not require you to include the header file of CSomeClass, since you are merely passing a pointer to it, not using the class. This means that the compiler only needs to parse one line (class CSomeClass;) instead of an entire header file (that might be chained to other header files etc etc).

This reduces both compile time and link time, and we are talking big optimizations here if you have many headers and many classes.
