Could C++ or C99 theoretically be compiled to equally-portable C90?

This is a big question, so let me get a few things out of the way:

  1. Let's ignore the fact that some C++ features cannot be implemented in C (for example, supporting pre-main initialization for any global static object that is linked in).
  2. This is a thought experiment about what is theoretically possible. Please do not write to say how hard this would be (I know), or that I should do X instead. It's not a practical question, it's a fun theoretical one. :)

The question is: is it theoretically possible to compile C++ or C99 to C89 that is as portable as the original source code?

Cfront and Comeau C/C++ already compile C++ to C. But according to Comeau's sales staff, the C that Comeau produces is not portable. I have not used the Comeau compiler myself, but I speculate that the reasons for this are:

  1. Macros such as INT_MAX, offsetof(), etc. have already been expanded, and their expansion is platform-specific.
  2. Conditional compilation such as #ifdef has already been resolved.

My question is whether these problems could possibly be surmounted in a robust way. In other words, could a perfect C++ to C compiler be written (modulo the unsupportable C++ features)?

The trick is that you have to expand macros enough to do a robust parse, but then fold them back into their unexpanded forms (so they are again portable and platform-independent). But are there cases where this is fundamentally impossible?
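To make the "fold back" idea a bit more concrete, here is a rough sketch of what emitted C90 might look like when the macros are left symbolic. The struct Point and both helper names are made up for illustration. The only point is that INT_MAX, INT_MIN and offsetof appear unexpanded, so each target's own headers supply the values.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical type, invented for this sketch. */
struct Point {
    char tag;
    long x;
};

/* INT_MAX and INT_MIN stay symbolic, so the target's <limits.h>
   decides the actual bounds when this file is compiled. */
static int clamp_to_int(long v)
{
    if (v > INT_MAX) {
        return INT_MAX;
    }
    if (v < INT_MIN) {
        return INT_MIN;
    }
    return (int)v;
}

/* offsetof stays symbolic as well, so padding after 'tag' is
   whatever the target compiler chooses. */
static long *Point_x(void *base)
{
    return (long *)((char *)base + offsetof(struct Point, x));
}
```

Had the translator baked in something like 2147483647 or a fixed byte offset instead, the output would only be right for the machine it was generated on.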

It would be very difficult for anyone to categorically say "yes, this is possible" but I'm very interested in seeing any specific counterexamples: code snippets that could not be compiled in this way for some deep reason. I'm interested in both C++ and C99 counterexamples.

I'll start out with a rough example just to give a flavor of what I think a counterexample might look like.

#ifdef __SSE__
#define OP <
#else
#define OP >
#endif

class Foo {
 public:
  bool operator <(const Foo& other) { return true; }
  bool operator >(const Foo& other) { return false; }
};

bool f() { return Foo() OP Foo(); }

This is tricky because the value of OP and therefore the method call that is generated here is platform-specific. But it seems like it would be possible for the compiler to recognize that the statement's parse tree is dependent on a macro's value, and expand the possibilities of the macro into something like:

bool f() {
#ifdef __SSE__
   return Foo_operator_lessthan(...);
#else
   return Foo_operator_greaterthan(...);
#endif
}


It is not only theoretically possible, but also practically trivial - use LLVM with a cbe target.


Theoretically all Turing-complete languages are equivalent.

You can compile C++ to an object code, and then decompile it to plain C or use an interpreter written in plain C.


In theory of course anything could be compiled to C first, but it is not practical to do so, specifically for C++.

For Foo operator< in your example it could be converted to:

bool isLess(const struct Foo * left, const struct Foo * right );

as a function signature. (C90 has no bool, so return int or char instead. const is part of C90, but for older pre-ANSI compilers that lack it, just leave it out.)
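Putting this together with the question's example, a minimal C90 sketch could look like the following. The names Foo_operator_lessthan and Foo_operator_greaterthan come from the question. The dummy member and the int return type are my own assumptions (C90 does not allow an empty struct and has no bool).

```c
/* C90 does not allow an empty struct, so a dummy member is assumed. */
struct Foo {
    int unused;
};

/* Lowered form of Foo::operator<, with 'this' passed explicitly. */
static int Foo_operator_lessthan(const struct Foo *self,
                                 const struct Foo *other)
{
    (void)self;
    (void)other;
    return 1;
}

/* Lowered form of Foo::operator>. */
static int Foo_operator_greaterthan(const struct Foo *self,
                                    const struct Foo *other)
{
    (void)self;
    (void)other;
    return 0;
}

/* The macro-dependent call is kept as conditional compilation,
   so the choice is still made on the target platform. */
static int f(void)
{
    struct Foo a;
    struct Foo b;
    a.unused = 0;
    b.unused = 0;
#ifdef __SSE__
    return Foo_operator_lessthan(&a, &b);
#else
    return Foo_operator_greaterthan(&a, &b);
#endif
}
```

The two temporaries from Foo() OP Foo() become ordinary locals here. A real translator would also have to emit constructor and destructor calls for them.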

Virtual functions are trickier: you need function pointers. A C++ class such as

struct A
{
   virtual int method( const std::string & str );
};

could become a C struct that carries a function pointer taking the object explicitly:

struct A
{
   int (*method)( struct A*, const struct string *);
};

A call such as

a.method( "Hello" );

then turns into

a.method( &a, create_String( "Hello" ) ); 
          /* and take care of the pointer returned by create_String */
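Here is a slightly fuller sketch of the function-pointer approach in C90. It uses a shared table of function pointers per class rather than one pointer per object, which is one common way to do it and not necessarily what Cfront or Comeau emit. The derived type B, the calls counter, and all helper names are invented for illustration, and const char * stands in for std::string to keep it short.

```c
#include <string.h>

struct A;

/* One table per class, shared by all objects of that class. */
struct A_vtable {
    int (*method)(struct A *self, const char *str);
};

struct A {
    const struct A_vtable *vptr;
    int calls;
};

static int A_method(struct A *self, const char *str)
{
    self->calls++;
    return (int)strlen(str);
}

static const struct A_vtable A_vtbl = { A_method };

static void A_init(struct A *self)
{
    self->vptr = &A_vtbl;
    self->calls = 0;
}

/* Hypothetical derived class: embeds A first, overrides method. */
struct B {
    struct A base;
};

static int B_method(struct A *self, const char *str)
{
    self->calls++;
    return -(int)strlen(str);
}

static const struct A_vtable B_vtbl = { B_method };

static void B_init(struct B *self)
{
    A_init(&self->base);
    self->base.vptr = &B_vtbl;
}

/* Virtual dispatch: look the function up through the object. */
static int call_method(struct A *a, const char *str)
{
    return a->vptr->method(a, str);
}
```

Calling call_method on an A initialised with A_init runs A_method, while passing &b.base for a B initialised with B_init runs B_method, even though the caller only sees a struct A *.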


There are a number of subtle differences. For example, consider the line:

int i = UINT_MAX;

IIRC, in C++ this assigns an implementation-defined value. In C99 and C89, it assigns an implementation-defined value, or raises an implementation-defined signal. So if you see this line in C++, you can't just pass it through to a C89 compiler unmodified unless you make the non-portable assumption that it won't raise a signal.

Btw, if I've remembered wrong, think of your own example of differences in the standards relating to relatively simple expressions...
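For the UINT_MAX line above, one way a translator could avoid depending on the target compiler's conversion rule is to emit an explicit helper instead of a bare assignment. This is only a sketch: the helper name is made up, and picking modular wraparound as the result is my assumption, since the source language only says the value is implementation-defined. It also assumes the usual two's complement ranges, i.e. UINT_MAX == 2 * INT_MAX + 1 and INT_MIN == -INT_MAX - 1.

```c
#include <limits.h>

/* Hypothetical helper a translator might emit for 'int i = u;'.
   Every intermediate value is representable, so no out-of-range
   conversion is ever handed to the C compiler (given the range
   assumptions stated above). */
static int uint_to_int_wrapped(unsigned int u)
{
    if (u <= (unsigned int)INT_MAX) {
        return (int)u;
    }
    /* Here u > INT_MAX, so UINT_MAX - u fits in an int. */
    return -(int)(UINT_MAX - u) - 1;
}
```

So int i = UINT_MAX; would come out as int i = uint_to_int_wrapped(UINT_MAX); which already looks rather different from the original line.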

So, as "grep" says, you can do it because C89 is a rich enough language to express general computation. On the same grounds, you could write a C++ compiler that emits Perl source.

By the sound of your question, though, you're imagining that the compiler would make a set of defined modifications to the original code to make it compile as C89. In fact, even for simple expressions in C++ or C99, the C89 emitted might not look very much like the original source at all.

Also, I've ignored that there may be some parts of the standard libraries you just can't implement, because C89 doesn't offer the capabilities, so you'd end up with a "compiler" but not a complete implementation. I'm not sure. And as dribeas points out, low-level features like VLAs present problems: basically you can't portably use the C89 "stack" as your C99 "stack". Instead you'd have to dynamically allocate memory from C89 to use for automatic variables required in the C99 source.
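A minimal sketch of that idea for a VLA, assuming a C99 function that declared long a[n]; as a local. The function name and the "return -1 on allocation failure" policy are invented for illustration. C99 itself gives no way to report a VLA that cannot be allocated.

```c
#include <stdlib.h>

/* C99 original (assumed for this sketch):
       long sum_squares(int n) { long a[n]; ... }
   C90 version: the array lives on the heap instead. */
static long sum_squares(int n)
{
    long *a;
    long total = 0;
    int i;

    if (n <= 0) {
        return 0;
    }
    a = (long *)malloc((size_t)n * sizeof *a);
    if (a == NULL) {
        return -1; /* hypothetical failure policy */
    }
    for (i = 0; i < n; i++) {
        a[i] = (long)i * (long)i;
    }
    for (i = 0; i < n; i++) {
        total += a[i];
    }
    free(a); /* must happen on every path that leaves the scope */
    return total;
}
```

The awkward part is the free: the translator has to insert it on every exit from the scope, including early returns, goto, and longjmp.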


One big problem is exceptions. It might be possible to emulate them using setjmp, longjmp etc., but this would always be extremely inefficient compared to a real device-aware unwind engine.
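For what it's worth, a bare-bones setjmp/longjmp emulation might look like this. All names are made up, the handler stack is a single global (so no threads), only int "exceptions" are supported, and nothing here runs destructors for the frames that get skipped, which a real translator would have to add.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* One node per active try block, linked into a stack. */
struct handler {
    jmp_buf env;
    struct handler *prev;
};

static struct handler *handler_top = NULL;
static int current_exception = 0;

/* Emulates 'throw code;'. */
static void throw_int(int code)
{
    struct handler *h = handler_top;
    if (h == NULL) {
        abort(); /* no handler: roughly std::terminate */
    }
    current_exception = code;
    handler_top = h->prev; /* leave the try block being unwound */
    longjmp(h->env, 1);
}

static int checked_div(int a, int b)
{
    if (b == 0) {
        throw_int(42);
    }
    return a / b;
}

/* Emulates: try { return checked_div(a, b); }
             catch (int e) { return -e; }       */
static int safe_div(int a, int b)
{
    struct handler h;
    int result;

    h.prev = handler_top;
    handler_top = &h;
    if (setjmp(h.env) == 0) {
        result = checked_div(a, b);
        handler_top = h.prev; /* normal exit from the try block */
        return result;
    }
    return -current_exception; /* the catch block */
}
```

Every try block pays for a setjmp even when nothing is thrown, which is where the inefficiency comes from.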


http://www.comeaucomputing.com

There's no better proof of feasibility than a working example. Comeau is one of the most conforming C++03 compilers, and has support for many features of the upcoming standard, but it does not really generate binary code. It just translates your C++ code into C code that can be compiled with different C backends.

As for portability, I would assume it is not possible. There are some features that cannot be implemented without compiler-specific extensions. The first example that comes to mind is C99 variable-length arrays: int n; int array[n]; which cannot be implemented in pure C89 (AFAIK) but can be implemented on top of extensions like alloca.
