Why are the interpreters of all popular scripting languages written in C (if not in C at least not in C++)?
I recently asked a question on switching from C++ to C for writing an interpreter for speed, and I got a comment from someone asking why on earth I would switch to C for that.
So I found out that I actually don't know why - except that C++'s object-oriented system has a much higher level of abstraction and is therefore slower.
- Why are the interpreters of all popular scripting languages written in C and not in C++?
If you want to tell me about some other language whose interpreter isn't written in C, please replace all occurrences of "popular scripting languages" in this question with "Ruby, Python, Perl and PHP".
C is a very old language, and is thus supported on pretty much every system available. It is therefore a good choice for any project that needs to be ported everywhere.
Ruby dates back to 1995. If you were writing an interpreter in 1995, what were your options? Java was released in the same year. (And it was painfully slow in v1.0 and, in many ways, not really worth using.)
C++ was not yet standardized, and compiler support for it was very sketchy. It had also not yet made the transition to the "modern C++" that we use today. I think the STL was proposed for standardization around this time as well; it didn't actually get added to the standard until years later. And even after it was added, it took several more years for 1) compilers to catch up, and 2) people to get used to this generic programming style. Back then, C++ was an OOP language first and foremost, and in many cases, that style of C++ was quite a bit slower than C. (In modern C++ code, that performance difference is pretty much eliminated, partly through better compilers, and partly through better coding styles, with less reliance on OOP constructs and more on templates and generic programming.)
Python was first released in 1991. Perl is even older (1987).
PHP is from 1995 as well, but additionally, and importantly, was created by a guy who knew virtually nothing of programming. (and yes, of course this has shaped the language in many important ways)
The languages you mention were started in C because C was the best bet for a portable, future-proof platform back then.
And while I haven't looked this up, I'm willing to bet that apart from the PHP case, which is shaped by incompetence more than anything, the language designers of the other languages chose C because they already knew it. So perhaps the lesson is not "C is best", but "the language you already know is best".
There are other reasons why C is often chosen:
- experience and accessibility: C is a simple language that is fairly easy to pick up, lowering the barrier of entry. It's also popular, and there are a lot of experienced C programmers around. One reason why these languages have become popular might just be that it was easy to find programmers to help develop the interpreters. C++ is more complex to learn and use well. Today, that might not be so much of a problem, but 10 or 15 years ago?
- interoperability: Most languages communicate through C interfaces. Since your fancy new language is going to rely on components written in other languages (especially in early versions, when the language itself is limited and has few libraries), it's always nice and simple to call a C function. So since we're going to have some C code anyway, it might be tempting to go all the way and just write the whole thing in C.
- performance: C doesn't get in your way much. It doesn't magically make your code fast, but it allows you to achieve good performance. So does C++, of course, or many other languages. But it's true for C as well.
- portability: Practically every platform has a C compiler. Until recently, C++ compilers were much more hit and miss.
These reasons don't mean that C is in fact a superior language for writing interpreters (or for anything else), they simply explain some of the motivations that have caused others to write in C.
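To make the interoperability point a bit more concrete, here is a minimal sketch of how an interpreter written in C might expose plain C functions to scripts through a lookup table. The names (`native_fn`, `native_entry`, `find_native`) and the "every argument is a double" calling convention are assumptions made up for this illustration; they are not taken from Ruby, Python, Perl or PHP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature every native (C) function must have so the
   interpreter can call it: an argument count plus an array of doubles. */
typedef double (*native_fn)(int argc, const double *argv);

typedef struct {
    const char *name;  /* name visible to scripts */
    native_fn   fn;    /* C function that implements it */
} native_entry;

/* Example native: sums its arguments. */
static double native_sum(int argc, const double *argv) {
    double total = 0.0;
    for (int i = 0; i < argc; i++) total += argv[i];
    return total;
}

/* Example native: returns the largest argument (0 if there are none). */
static double native_max(int argc, const double *argv) {
    if (argc == 0) return 0.0;
    double best = argv[0];
    for (int i = 1; i < argc; i++)
        if (argv[i] > best) best = argv[i];
    return best;
}

/* The table the interpreter consults when a script calls a function. */
static const native_entry natives[] = {
    { "sum", native_sum },
    { "max", native_max },
};

/* Look up a native by name; returns NULL if it is not registered. */
static native_fn find_native(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof natives / sizeof natives[0]; i++)
        if (strcmp(natives[i].name, name) == 0) return natives[i].fn;
    return NULL;
}
```

Because the table holds ordinary C function pointers, anything that can produce a C-callable function (a system library, or code in another language with a C interface) can be plugged in without any glue layer.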
I'd guess it's because C is pretty much the only language that has a reasonably standard compiler for almost every platform in existence.
I would hazard a guess that it's in part due to C++ not being standardized until 1998, which made achieving portability that much harder.
All those languages you list were developed before that standardization.
Why are the interpreters of all popular scripting languages written in C and not in C++?
What makes you think that they are written in C? In my experience, the majority of implementations for the majority of scripting languages are written in languages other than C.
Here's a couple of examples:
Ruby
- BlueRuby: written in ABAP
- HotRuby: JavaScript
- Red Sun: ActionScript
- SmallRuby: Smalltalk/X
- MagLev: Ruby, GemStone Smalltalk
- Smalltalk.rb: Smalltalk
- Alumina: Smalltalk
- Cardinal: PIR, NQP, PGE
- RubyGoLightly: Go
- YARI: Io
- JRuby: Java
- XRuby: Java
- Microsoft IronRuby: C#
- the original IronRuby by Wilco Bauwer: C#
- Ruby.NET: C#
- NETRuby: C#
- MacRuby: Objective-C
- Rubinius: Ruby, C++
- MetaRuby: Ruby
- RubyVM: Ruby
Python
- IronPython: C#
- Jython: Java
- Pynie: PIR, NQP, PGE
- PyPy: Python, RPython
PHP
- P8: Java
- Quercus: Java
- Phalanger: C#
Perl6
- Rakudo: Perl6, PIR, NQP, PGE
- Pugs: Haskell
- Sprixel: JavaScript
- v6.pm: Perl5
- Elf: CommonLisp
JavaScript
- Narcissus: JavaScript
- Ejacs: ELisp
- Jint: C#
- IronJS: F#
- Rhino: Java
- Mascara (ECMAScript Harmony Reference Implementation): Python
- ECMAScript 4 Reference Implementation: Standard ML
The HotSpot JVM is written in C++, the Animorphic Smalltalk VM (from which HotSpot and V8 are derived) is written in C++, the Self VM (on which the Animorphic Smalltalk VM is based) is written in C++.
Interestingly enough, in many of the above cases, the implementations that are not written in C are actually faster than the ones written in C.
As an example of two implementations that are written in C, take Lua and CPython. In both cases, they are actually written in a small subset of a very old version of C. The reason for this is that they want to be highly portable. CPython, for example, runs on platforms for which a C++ compiler doesn't even exist. Also, Perl was written in 1987, CPython in 1990, Lua in 1993, SpiderMonkey in 1995. C++ wasn't standardized until 1998.
The complexity of C++ is great compared to that of C - many people consider it one of the most complex and error-prone languages in existence.
Many of the features of C++ are problematic as well - the STL was standardized many years ago and it still lacks a single great implementation.
OOP is certainly great, but it does not outweigh C++'s deficiencies in many scenarios.
Most well-known compiler books are written with examples in C. Also, two of the major tools, lex (which builds a lexer) and yacc (which translates a grammar into a parser), generate C code.
If the question is about why C and not C++, the answer comes down to the fact that when you implement a scripting language, the C++ object model gets in your way. It's so restricted that you will not be able to use it for your own objects.
So you can only use it for the internals, and there you usually do not get enough benefit from C++ over the much simpler C language, which is easier to port and distribute.
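As a rough illustration of what "your own objects" tends to look like, an interpreter usually defines its own value representation instead of mapping script objects onto host-language classes. The following is only a sketch of one common approach, a tagged union; the type names and the truthiness rule are assumptions invented for this example and do not describe any particular interpreter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical set of script-level types; a real language has more. */
typedef enum { VAL_NIL, VAL_BOOL, VAL_NUMBER, VAL_STRING } value_type;

/* One script value: a tag saying which member of the union is valid. */
typedef struct {
    value_type type;
    union {
        bool        boolean;
        double      number;
        const char *string;  /* borrowed pointer, not owned by the value */
    } as;
} value;

static value make_nil(void) {
    value v; v.type = VAL_NIL; v.as.number = 0.0; return v;
}
static value make_bool(bool b) {
    value v; v.type = VAL_BOOL; v.as.boolean = b; return v;
}
static value make_number(double n) {
    value v; v.type = VAL_NUMBER; v.as.number = n; return v;
}
static value make_string(const char *s) {
    assert(s != NULL);
    value v; v.type = VAL_STRING; v.as.string = s; return v;
}

/* Truthiness rule assumed for this sketch: nil, false, 0 and "" are false. */
static bool is_truthy(value v) {
    switch (v.type) {
    case VAL_NIL:    return false;
    case VAL_BOOL:   return v.as.boolean;
    case VAL_NUMBER: return v.as.number != 0.0;
    case VAL_STRING: return v.as.string[0] != '\0';
    }
    return false;  /* unreachable for valid tags */
}
```

Nothing here needs classes or virtual dispatch: the script language's type rules live in the `switch`, which is why the host language's object model adds little for this part of the job.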
The only problems when implementing a scripting language in C are the missing coroutine support (you have to switch your stack pointer in some way) and, most importantly, that there is no way to do exception handling without a lot of overhead (compared to C++).
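For reference, the usual workaround for exceptions in plain C is `setjmp`/`longjmp` from `<setjmp.h>`. Below is a minimal sketch of that technique; `vm_state`, `vm_throw` and `vm_protected_div` are hypothetical names chosen for this example. Note that, unlike C++ exceptions, `longjmp` runs no cleanup code for the frames it skips, so the interpreter has to release resources by hand.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical interpreter state: where to resume on error, and why. */
typedef struct {
    jmp_buf     handler;  /* saved by setjmp in the protected call */
    const char *message;  /* set by vm_throw before jumping */
} vm_state;

/* "Throw": record the message and jump back to the active handler. */
static void vm_throw(vm_state *vm, const char *message) {
    vm->message = message;
    longjmp(vm->handler, 1);
}

/* An operation that can fail deep inside the interpreter. */
static int checked_div(vm_state *vm, int a, int b) {
    if (b == 0) vm_throw(vm, "division by zero");
    return a / b;
}

/* "Try/catch": returns 0 and stores the quotient in *result on success,
   or returns 1 (leaving *result untouched) if an error was thrown. */
static int vm_protected_div(vm_state *vm, int a, int b, int *result) {
    vm->message = NULL;
    if (setjmp(vm->handler) != 0) return 1;  /* reached via longjmp */
    *result = checked_div(vm, a, b);
    return 0;
}
```

Every protected call pays for a `setjmp` even when nothing goes wrong, which is one form of the overhead mentioned above.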