Why are most of the biggest open source projects in C? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 7 years ago.

I'm having a debate with a friend and we're wondering why so many open source projects have decided to go with C instead of C++. Projects such as Apache, GTK, Gnome and more opted for C, but why not C++ since it's almost the same?

We're precisely looking for the reasons that would have led those projects (not only those I've listed but all C projects) to go with C instead of C++. Topics can be performance, ease of programming, debugging, testing, conception, etc.


C is very portable, much more than C++ was 10 years ago.

Also, C is very entrenched in the Unix tradition. Read more in 'The Art of Unix Programming', about Unix and OO in general, and about specific languages on Unix (including C and C++).


There are numerous counter examples: everything based on Qt for one.

Also, on my Debian testing system:

edd@ron:~$ apt-cache rdepends libstdc++6|wc -l
4101

So that's 4101 packages depending on the basic C++ library. For comparison, I get about 14,982 for libc6, or roughly 3.6 times as many. But it is not as if there aren't any C++ projects in open source land.

Edit: Thinko on my part: as the C++ packages also depend on libc6, the ratio really is

(14982 - 4101)/4101 = 2.65

so there are roughly 2 1/2 times as many packages implemented in C as there are in C++.


Eric Raymond's wonderful book "The Art of Unix Programming" has some reflections on this issue (the whole book is well worth reading in either the paper or free online editions, I'm just pointing to the relevant section -- Eric was involved with the coining and introduction of the term "open source", and is always well worth reading;-0).

Summarizing that section, Raymond claims that "OO languages show some tendency to suck programmers into the trap of excessive layering" and Unix programmers (and by extension open-source programmers) resist that trap of "thick glue".

Later in the book, you find some considerations specifically about C++, such as "It may be that C++'s realization of OO is particularly problem-prone". Whether you agree or not, the whole text is well worth reading (I can hardly do it justice here!-), and rich with bibliography pointing you to many other relevant studies and publications.


Linus Torvalds has ranted several times on the topic of C++ -- he uses C for git, and of course the Linux kernel is mostly C:

  • on C++ and git (warning: don flame-retardant first)
  • an interview with Linus from 1998

You can easily find more of these, and while it's in his nature to get a bit flamey about these things, there are some valid points.

One of the more interesting (from where I'm sitting, anyway) is the observation that C++ compilers and libraries were (and to some degree are) a lot more buggy than the corresponding C compilers. This stands to reason given the relative complexities of the two languages.

It smells a little of "not invented here" (NIH) syndrome, but when you have the entire Linux kernel developer base, you can sometimes afford to reinvent things "The Right Way".


A lot of the projects started before C++ was standardized, so C was the obvious choice and a change later would be hard. C was standardized about a decade before C++, and has been more nearly portable for even longer. So, it was largely a pragmatic decision at the time, inspired in part by the Unix heritage of using C for most code.


C++ is a mess. It is an overly complicated language, so complicated that only a few people can say they know all of it. And even fewer compilers really comply with the C++ standard.

So I think the reason is simplicity and portability.

If you want higher-level, object-oriented programming, then I think C++ simply faces competition from languages like Python. (Note that I programmed in C++ for a few years; it's fast and has some features from higher-level languages that speed up development, so no offence.)


I have worked on a few C++ projects in my time, all of which have ended in tears one way or the other. At the most fundamental level, the truth is that people can't be trusted. They can't be trusted to write good code, they can't be trusted to debug it, and they certainly can't be trusted to understand it when they have to come back and modify it again weeks/months later.

C code doesn't have a lot of the weird stuff in C++ that makes it hard to debug (constructors/destructors, anything that happens with static global objects during cpp_initialize() time, etc.). That just makes it easier to deal with when developing and maintaining a big project.
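
To make the point about static global objects concrete, here is a minimal sketch. The names event_log and Registry are hypothetical and exist only for illustration. It shows constructors of namespace-scope objects running with no visible call site, which on typical implementations happens before main() starts:

```cpp
#include <string>
#include <vector>

// Hypothetical event log. A function-local static is used so the log is
// guaranteed to exist whenever a constructor first touches it.
std::vector<std::string> &event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor has a side effect.
struct Registry {
    explicit Registry(const char *name) {
        event_log().push_back(std::string("constructed ") + name);
    }
};

// Namespace-scope objects: their constructors run during static
// initialization (typically before main()), in definition order within
// this translation unit. Nothing in the program calls them explicitly.
Registry first_registry("first");
Registry second_registry("second");
```

By the time any code in this file is called from main(), event_log() already holds two entries. A debugger breakpoint on main() is hit after that work is done, which is part of why this kind of code is harder to trace than the C equivalent.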

Maybe I'm a luddite, but every time someone says "C++" around me I get shivers.


Some people have mentioned portability, but these days the portability of C++ isn't much of an issue (it runs on anything GCC runs on, which is essentially anything). However, portability is more than just architecture-to-architecture or OS-to-OS. In the case of C++, it includes compiler-to-compiler.

Let's discuss ABI, or Application Binary Interface. This basically means "how your code translates into assembly." In C, when you write:

int dostuff(const char *src, char *dest);

You know that you're making a symbol in your object file called _dostuff (on many platforms, C global names are prefixed by an underscore in the resulting assembly; on others, such as ELF-based Linux, the symbol is plain dostuff). But in C++, when you write this:

int dostuff(const char *src, char *dest);
int dostuff(const char *src, char *dest, size_t len);

Or even:

int dostuff(std::string src, std::string dest);

All bets are instantly off. You now have two distinct functions, and the compiler has to make each, and has to give each a unique name. So C++ allows (where I believe C doesn't) name mangling, which means those two functions might get translated to _dostuff_cp_cp and _dostuff_cp_cp_s (so that each version of the function that takes a different number of arguments has a different name).

The problem with this is (and I consider this a huge mistake, even though it's not the only problem with cross-compiler portability in C++) that the C++ standard left the details of how to mangle these names up to the compiler. So while one C++ compiler may do that, another may do _cp_cp_s_dostuff, and yet another may do _dostuff_my_compiler_is_teh_coolest_char_ptr_char_ptr_size_t. The problem is exacerbated (always find a way to sneak this word into anything you say or write) by the fact that you have to mangle names for more than just overloaded functions - what about methods and namespaces and method overloading and operator overloading and... (the list goes on). There is only one standard way to ensure that your function's name is actually what you expect it to be in C++:

extern "C" int dostuff(const char *src, char *dest);

Many applications need to have (or at least find it very useful to have) a standard ABI provided by C. Apache, for example, couldn't be nearly as cross-platform and easily extensible if it was in C++ - you'd have to account for the name mangling of a particular compiler (and a particular compiler version - GCC has changed a few times in its history) or require that everyone use the same compiler universally - which means that, every time you upgrade your C++ compiler with a backwards incompatible name-mangling scheme, you have to recompile all your C++ programs.
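
As a rough illustration of the above, the sketch below defines the two overloads plus an extern "C" wrapper. The function bodies and the name dostuff_c are made up for the example. The symbol names in the comments are what GCC or Clang should emit on x86-64 Linux under the Itanium C++ ABI; other compilers (MSVC, for instance) use entirely different schemes, which is exactly the problem being described:

```cpp
#include <cstddef>
#include <cstring>

// Two overloads, so the compiler must emit two distinct symbols.
// Expected symbol with GCC/Clang on x86-64 Linux: _Z7dostuffPKcPc
int dostuff(const char *src, char *dest) {
    std::strcpy(dest, src);  // caller must supply a large enough buffer
    return static_cast<int>(std::strlen(dest));
}

// Expected symbol with GCC/Clang on x86-64 Linux: _Z7dostuffPKcPcm
// (the trailing 'm' encodes unsigned long, i.e. size_t on that platform)
int dostuff(const char *src, char *dest, std::size_t len) {
    if (len == 0) return 0;
    std::strncpy(dest, src, len - 1);
    dest[len - 1] = '\0';  // strncpy does not always terminate
    return static_cast<int>(std::strlen(dest));
}

// extern "C" disables mangling: the symbol is dostuff_c (or _dostuff_c on
// platforms that add a leading underscore). The price is that an
// extern "C" function cannot be overloaded, hence the different name.
extern "C" int dostuff_c(const char *src, char *dest, std::size_t len) {
    return dostuff(src, dest, len);
}
```

Compiling this to an object file and running nm on it is a quick way to see the scheme a given compiler actually uses.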

This post turned into something of a monster, but I think it illustrates a good point, and I'm too tired to try to trim it down.


As someone who dislikes C++ and would pick C over it any day, I can at least give you my impressions on the topic. C++ has several attributes that make it unappealing:

  • Complicated objects. C++ has a huge number of features aimed at supporting OO, which makes the language very complex.
  • Nonstandard syntax. Even today most C++ compilers support quirks that make ensuring successful and correct compilation between compilers difficult.
  • Nonstandard libraries. Compared to C libraries, C++ libraries are not nearly as standardized across systems. Having had to deal with Make issues associated with this before I can tell you that going with C is a big time saver.

That said, C++ does have the benefits of supporting objects. But when it comes down to it, even for large projects, modularity can be accomplished without objects. When you add in the fact that essentially every programmer who might contribute code to any project can program C, it seems hard to make the choice to go with anything else if you need to write your code that close to the metal.

All that said, many projects jump over C++ and go to languages like Python, Java, or Ruby because they provide more abstraction and faster development. When you add in their ability to support compiling out to/loading in from C code for parts that need the performance kick, C++ loses what edge it could have had.


If you look at recent open source projects, you'll see many of them use C++. KDE, for instance, has all of its subprojects in C++. But for projects that started a decade ago, it was a risky decision. C was way more standardized at the time, both formally and in practice (compiler implementations). Also, C++ depended on a bigger runtime and lacked good libraries at that time. Personal preference plays a big role in such decisions, and at that time the C workforce in UNIX/Linux projects was far bigger than the C++ one, so the probability that the initial developer(s) of a new project were more comfortable with C was greater. Also, any project that needs to expose an API would do that in C (to avoid ABI problems), so that would be another argument in favor of C. And finally, before smart pointers became popular, it was much more dangerous to program in C++. You'd need more skilled programmers, and they would need to be overly cautious. Although C has the same problems, its simpler data structures are easier to debug using bounds checking tools/libraries.
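
The smart pointer remark can be illustrated with a small sketch. Buffer, parse_raw and parse_owned are hypothetical names invented for this example. The raw-pointer version leaks on its early-return path, which was the typical hazard; the std::unique_ptr version (C++11) cannot:

```cpp
#include <memory>

// Hypothetical resource that counts how many instances are alive.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Pre-smart-pointer style: every exit path must remember to delete.
bool parse_raw(bool fail_early) {
    Buffer *b = new Buffer;
    if (fail_early) {
        return false;  // bug: b is never deleted on this path
    }
    delete b;
    return true;
}

// unique_ptr style: the destructor runs on every exit path.
bool parse_owned(bool fail_early) {
    std::unique_ptr<Buffer> b(new Buffer);
    if (fail_early) {
        return false;  // b is released automatically here
    }
    return true;       // and here
}
```

After parse_owned(true) returns, Buffer::live is back to 0. After parse_raw(true) it stays at 1, and nothing in the code still holds the pointer to fix that.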

Also consider that C++ is an option only for high-level code (desktop apps and the like). The kernel, drivers, etc. are not viable candidates for C++ development. C++ has too much "under the hood" behavior (constructor/destructor chains, virtual methods table, etc) and in such projects you need to be sure the resulting machine/assembly code won't have any surprises and doesn't depend on runtime library support to work.
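
A minimal sketch of that "under the hood" behavior follows. Device, Disk, trace and probe are hypothetical names. The function probe() contains three visible statements, yet four constructor/destructor calls and one virtual dispatch happen around them:

```cpp
#include <string>
#include <vector>

// Hypothetical trace buffer recording what actually runs.
std::vector<std::string> &trace() {
    static std::vector<std::string> events;
    return events;
}

struct Device {
    Device() { trace().push_back("Device()"); }
    virtual ~Device() { trace().push_back("~Device()"); }
    virtual const char *name() const { return "device"; }
};

struct Disk : Device {
    Disk() { trace().push_back("Disk()"); }
    ~Disk() override { trace().push_back("~Disk()"); }
    const char *name() const override { return "disk"; }
};

std::string probe() {
    Disk d;                  // implicitly runs Device() and then Disk()
    const Device &ref = d;
    return ref.name();       // virtual call, typically via a vtable lookup
}                            // implicitly runs ~Disk() and then ~Device()
```

Calling probe() once leaves "Device()", "Disk()", "~Disk()", "~Device()" in the trace, in that order. None of those calls appear in the source of probe() itself, which is the kind of surprise kernel and driver code tries to avoid.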


One important aspect in addition to others that will doubtless be mentioned is that C is easier to interface with other languages, so in the case of a library intended to be widely useful, C may be chosen even nowadays for this purpose.

To take examples I am familiar with, the toolkit GTK+ (in C) has robust OCaml bindings, while Qt and Cocoa (respectively in C++ and Objective C) only have proof-of-concepts for such bindings. I believe that the difficulty of interfacing OCaml with languages other than C is part of the reason.


One reason might be that the GNU coding standards specifically ask you to use C. Another reason I can think of is that the free software tools work better with C than C++. For example, GNU indent doesn't do C++ as well as it does C, or etags doesn't parse C++ as well as it parses C.


I can list a couple more reasons

  1. C code produces more compact object code. Try compiling 'Hello World' as both a C and a C++ program and compare the sizes of the executables. This may not be too relevant today, but it definitely was a factor 10+ years ago.
  2. It is much easier to use dynamic linking with C programs. Most C++ libraries still expose their entry points through a C interface. So instead of writing a bridge between C++ and C, why not program the whole thing in C?
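
The "bridge" mentioned in point 2 usually looks something like the sketch below: a C++ implementation hidden behind an opaque handle and a handful of extern "C" functions. All names here (wordlist and its functions) are hypothetical, and this is only one common way to structure such a wrapper:

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// What the public header would declare. C callers only ever see an
// incomplete type and plain functions with unmangled names.
extern "C" {
struct wordlist;  // opaque handle
wordlist *wordlist_new(void);
int wordlist_add(wordlist *wl, const char *word);
std::size_t wordlist_size(const wordlist *wl);
void wordlist_free(wordlist *wl);
}

// The C++ side, invisible to callers: the handle wraps a std::vector.
struct wordlist {
    std::vector<std::string> words;
};

extern "C" wordlist *wordlist_new(void) {
    return new (std::nothrow) wordlist();  // returns NULL on failure
}

extern "C" int wordlist_add(wordlist *wl, const char *word) {
    try {
        wl->words.emplace_back(word);
        return 0;
    } catch (...) {
        return -1;  // exceptions must not escape into C callers
    }
}

extern "C" std::size_t wordlist_size(const wordlist *wl) {
    return wl->words.size();
}

extern "C" void wordlist_free(wordlist *wl) {
    delete wl;
}
```

Every function needs this kind of wrapper, plus error-code translation for exceptions, which is the overhead the answer suggests avoiding by writing the library in C from the start.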


First of all, some of the biggest open source projects are written in C++: Open Office, Firefox, Chrome, MySQL,...

Having said that, there are also many big projects written in C. Reasons vary: they may have been started when C++ was not standardized yet, or the authors are/were more comfortable with C, or they hoped that the easier learning curve for C would attract more contributors.


If correctly implemented, C is very fast and very portable, and the compilers are there.

C++ is different for each available compiler: the libraries don't agree and the standards don't match.


You can read Dov Bulka to find out what not to do in C++, you can read the Tesseract OCR source on Google Code, you can read lots of things, and most of them depend on where you are when it comes to deciding which programming language is superior. Where did you read that C has more open source code than C++? Well, of course you read that in a C forum. Go to a forum for some other programming language and do the same search: you will find that that language has more open source code.
