Overcoming C limitations for large projects
One aspect where C shows its age is the encapsulation of code. Many modern languages has classes, namespaces, packages... a much more convenient to org开发者_运维百科anize code than just a simple "include".
Since C is still the main language for many huge projects. How do you to overcome its limitations?
I suppose that one main factor should be lots of discipline. I would like to know what you do to handle large quantity of C code, which authors or books you can recommend.
- Separate the code into functional units.
- Build those units of code into individual libraries.
- Use hidden symbols within libraries to reduce namespace conflicts.
Think of open source code. There is a huge amount of C code in the Linux kernel, the GNU C library, the X Window system and the Gnome desktop project. Yet, it all works together. This is because most of the code does not see any of the other code. It only communicates by well-defined interfaces. Do the same in any large project.
Some people don't like it but I am an advocate of organzing my structs and associated functions together as if they are a class where the this pointer is passed explicitly. For instance, combined with a consistent naming convention to make the namespace explicit. A header would be something like:
typedef struct foo {
int x;
double y;
} FOO_T
FOO_T * foo_new();
int foo_set_x(FOO_T * self, int arg1);
int foo_do_bar(FOO_T * self, int arg1);
FOO_T * foo_delete(FOO_T * self);
In the implementation, all the "private" functions would be static. The downside of this is that you can't actually enforce that the user not go and muck with the members of the struct. That's just life in c. I find this style though makes for nicely reusable C types.
A good way you can achieve some encapsulation is to declare internal methods or variables of a module as static
As Andres says, static
is your friend. But speaking of friends... if you want to be able to separate a library in two files, then some symbols from one file that need to be seen in the other can not be static
.
Decide of some naming conventions: all non-static symbols from library foo
start with foo_
. And make sure they are always followed: it is precisely the symbols for which it seems constraining ("I need to call it foo_max
?! But it is just max
!") that there will be clashes.
As Zan says, a typical Linux distribution can be seen as a huge project written mostly in C. It works. There are interfaces, and large-subprojects are implemented as separate processes. An implementation in separate processes helps for debugging, for testing, for code reuse, and it provides a second hierarchy in addition to the only one that exists at link level. When your project becomes large enough, it may start to make sense to put some of the functionalities in separate processes. Something already as specialized as a C compiler is typically implemented as three processes: pre-processor, compiler, assembler.
If you can control the project (e.g. in-house or you pay someone else to do it) you can simply set rules and use reviews and tools to enforce them. There is no real need for the language to do this, you can for instance demand that all functions usable outside a module (=a set of files, don't even need to be a separate) must be marked thus. In effect, you would force the developers to think about the interfaces and stick with them.
If you really want to make the point, you could define macros to show this as well, e.g.
#define PUBLIC
#define PRIVATE static
or some such.
So you are right, discipline is the key here. It involves setting the rules AND making sure that they are followed.
精彩评论