
Is the primary implementation of *any* popular programming language interpreter written in C++?

At the moment I am considering whether to rewrite, in C++, a programming language interpreter that I maintain. The interpreter is currently implemented in C.

But I was wondering: is the primary implementation of any popular programming language interpreter in use today written in C++? (I say "primary" because, certainly, people have made versions of many interpreters in a language other than the one used by the original authors.)

And, if not, is there a good reason for not writing an interpreter in C++? It is my understanding that C++ code, if written correctly, can be very portable and can potentially compile to run just as fast as compiled C code that does the same thing.


I wrote an interpreter in C++ (after many in C over the years) and I think C++ is a decent language for that. If I could travel back in time, the one implementation choice I would change is supporting several different interpreters running at the same time (each one multithreaded), simply because it made the code more complex and was never used. Multithreading is quite useful, but multiple instances of the interpreter were pointless...

However, my big regret now is the very fact that I wrote that interpreter: it's used in production, with a fair amount of code written in it and people trained on it, and the language is uglier and less powerful than Python... but switching to Python now would add costs. It has no bugs known to me, yet it's worse than Python, and that is itself a bug (in addition to the mistake already made of paying the cost of writing it for no reason).

I simply should have used Python from the start (or Lua, or any other ready-made interpreter that is easy to embed and has reasonable licensing)... my only excuse is that I didn't know about Python or Lua at the time.

While writing an interpreter is a fun thing to do as a programming exercise, I'd suggest you avoid writing your own for production, especially (please don't take it personally) if the care that low-level complexity requires is out of your reach (I find, for example, the presence of several memory leaks quite shocking).

C++ is still a low-level language, and while you can get some help, for example on the memory handling side, the main assumption of the language is still that your code is 100% right: no runtime error is going to help you (only undefined behaviour daemons).
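As a rough illustration of the kind of help on the memory handling side mentioned above, here is a minimal sketch of a toy AST whose nodes own their children through std::unique_ptr. The names Node, Number, Add and demo are made up for this example and are not taken from any real interpreter.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical AST for a toy expression language; all names are illustrative.
struct Node {
    virtual ~Node() = default;        // virtual so deleting via Node* is safe
    virtual double eval() const = 0;
};

struct Number : Node {
    explicit Number(double v) : value(v) {}
    double eval() const override { return value; }
    double value;
};

struct Add : Node {
    Add(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);  // a null child would be a bug in the caller
    }
    double eval() const override { return lhs->eval() + rhs->eval(); }
    std::unique_ptr<Node> lhs, rhs;   // children are destroyed with the parent
};

// Builds and evaluates (1 + 2) + 3. Every node is released automatically
// when `tree` goes out of scope; there is no explicit delete anywhere.
double demo() {
    std::unique_ptr<Node> tree = std::make_unique<Add>(
        std::make_unique<Add>(std::make_unique<Number>(1),
                              std::make_unique<Number>(2)),
        std::make_unique<Number>(3));
    return tree->eval();
}
```

This removes one class of leaks, but it does not change the point above: holding a raw pointer or reference to a node after its owner is gone is still undefined behaviour, and nothing at runtime will warn you.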

If you couldn't meet this assumption of 100% correct code in C (a much simpler language), then I don't see how you can be confident you'll write correct code in C++ (a complexity monster in comparison). I suspect you would just end up with another buggy interpreter that you'll have to throw away.


If you wrote the current implementation and, as you say in your comment, it has:

clumsy symbol-handling and numerous memory leaks

then rewriting in C++ is not going to help you. First try to understand why the current implementation goes wrong. On the other hand, if you are not the original developer, then just choose whichever language you know best and port.

Update: I think sth's comment explains properly why many languages are implemented in C rather than C++. On the topic of complete rewrites, heed the words of Joel Spolsky.


Yes, many are. IIRC the HotSpot Java VM is written in C++, ... (Haskell's GHC, by contrast, is mostly written in Haskell itself, with a runtime system in C.)

As many here have noted, you should really have a look at LLVM, a toolkit for building compilers, interpreters and virtual machines. You basically do the frontend work (i.e. parsing your language + semantic analysis + codegen to LLVM IR) and LLVM will immediately give you builds for different platforms, JIT, optimization, compilation to native code, ... It also has some tools for parsing and ASTs, and for error handling and notification (but maybe that is part of the Clang subproject).


Most popular programming languages were created before many good C++ compilers were available. Therefore the primary interpreters of those languages did not start out in C++, and once you have put a lot of work into a working interpreter, you usually don't throw that away just because it could now also be written in C++.

And if you start a new project for an interpreter written in C++, it has a long way to go to become the primary implementation.


Google Chrome's V8 JavaScript engine implements ECMA-262 and it's extremely fast. Maybe you could rewrite your interpreter in C++, but you should think about other features, like implementing a bytecode specification, instead of rewriting your automata in C++. A rewrite will only help organize the code (which is a great thing for group work); it will do nothing for performance.
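To give an idea of what "implementing a bytecode specification" can mean in the smallest possible form, here is a sketch of a stack-based bytecode loop. The instruction set (Op, Instr) and the run function are invented for this example; a real design would also need jumps, calls and a compact encoding.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical instruction set, for illustration only.
enum class Op : std::uint8_t { Push, Add, Mul, Halt };

struct Instr {
    Op op;
    std::int64_t arg = 0;  // only used by Push
};

// Runs a program on a small operand stack and returns the value
// that is on top of the stack when Halt is reached.
std::int64_t run(const std::vector<Instr>& program) {
    std::vector<std::int64_t> stack;
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        std::int64_t v = stack.back();
        stack.pop_back();
        return v;
    };
    for (const Instr& in : program) {
        switch (in.op) {
            case Op::Push:
                stack.push_back(in.arg);
                break;
            case Op::Add: {
                std::int64_t b = pop();
                std::int64_t a = pop();
                stack.push_back(a + b);
                break;
            }
            case Op::Mul: {
                std::int64_t b = pop();
                std::int64_t a = pop();
                stack.push_back(a * b);
                break;
            }
            case Op::Halt:
                return pop();
        }
    }
    throw std::runtime_error("program ended without Halt");
}
```

A front end would compile (2 + 3) * 4 into Push 2, Push 3, Add, Push 4, Mul, Halt and hand that vector to run. The same loop could be written in C just as well, which is the point of the answer above: the bytecode design matters more than the implementation language.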


The GNU project has recently announced that new versions of GCC will be written in C++.


Tamarin, the Adobe and Mozilla ECMAScript interpreter, is written in C++. Being the one for which the original language author has responsibility, it might be considered the primary one (IIRC the ECMA reference implementation is written in OCaml, but that isn't actually used except as a reference).


Sun's Java implementation seems to be written mostly in C++.


If memory leaks are your only problem with your current program, then try Valgrind on it. I've never had a memory leak in my software that Valgrind could not track down for me. In fact, it has saved my butt on many occasions.

Here is a tutorial:

http://www.cprogramming.com/debugging/valgrind.html


I don't think I can (or want to) give this a blanket "yes". I think it's a matter of pragmatism combined with the needs of the individual language, and it also depends on whether it is a compiled language (or bytecode-compiled) or interpreted, or...

If you are trying to write cross-platform code, you will find that the lowest common denominator is usually a C compiler (due to different CPU architectures, assemblers are not suitable for deploying to many platforms). Since C++ was coded to sit on top of most C infrastructure (like using name mangling to fit type overloads into something a C linker understands), it is usually the lowest-common-denominator OO language that's available even on embedded systems. That makes it a popular choice for people who want to write their language in a high-level, maintainable fashion.

Also, most programming languages exist for a reason: they want to solve problems in a different way (better necessarily means different, after all). That means they have rather unusual needs regarding what their code must be able to do, and they don't use many of the support facilities another implementation language offers, because they wouldn't have enough control over them. So given that you'll want to reimplement a lot of things, e.g. the object model and data types, anyway, the low-level aspects of C++ are actually an advantage.

That said, many languages start out with their first version written in C++, a first simple compiler for instance, and then write the next version in that simple version ("bootstrapping"). This has the advantage that you can use your own language to extend it. To port it, they then modify only their compiler to cross-compile to the desired platform, then build the compiler with this cross-compiler, and the result is a native version of the full language for the new platform.

The languages that tend not to do this are mainly the scripting languages, which tend to remain interpreted languages implemented in C++ (though others have mentioned popular exceptions).

Another common reason to pick C++ is existing infrastructure. For example, you often need to drop down to C++ to bind to existing system frameworks, or to take advantage of existing compiler backends (like LLVM, which is written in C++). Even if those only use C, C++ is often the most suitable OO-like implementation language that can easily talk to the C parts of a system.
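As a sketch of how C++ code talks to the C parts of a system, the usual trick is to wrap the C++ implementation in functions declared extern "C", which switches off name mangling for those symbols so a C linker can find them. The Interp class and the interp_* functions below are hypothetical names chosen for this example.

```cpp
#include <new>
#include <string>
#include <unordered_map>

// Hypothetical interpreter core, written as ordinary C++.
class Interp {
public:
    void set(const std::string& name, double v) { vars_[name] = v; }
    double get(const std::string& name) const {
        auto it = vars_.find(name);
        return it == vars_.end() ? 0.0 : it->second;  // unknown names read as 0
    }
private:
    std::unordered_map<std::string, double> vars_;
};

// C-callable surface. extern "C" gives these functions unmangled names,
// and C callers only ever see Interp as an opaque pointer.
// Note: a production API would also catch C++ exceptions at this boundary.
extern "C" {

Interp* interp_new() { return new (std::nothrow) Interp(); }

void interp_free(Interp* in) { delete in; }

void interp_set(Interp* in, const char* name, double v) { in->set(name, v); }

double interp_get(const Interp* in, const char* name) { return in->get(name); }

}  // extern "C"
```

A C header for this would declare an incomplete type (for instance `typedef struct Interp Interp;`) plus the four function prototypes, so existing C code can embed the interpreter without knowing it is implemented in C++.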

So the question you want to ask yourself is likely: What are my needs, and what language best suits those?

Some languages are simply preprocessors on another language (C++ and Objective-C both started out as preprocessors on top of C). They add their own syntax or features, translate those into the implementation language, then compile that modified code using an existing compiler. If a language already does all you want, that may be a better approach, and let you leverage the experience of the engineers working on that other language, combining your work-hours into more than you alone could provide.
