C tail call optimization
I often hear开发者_开发百科 people say that C doesn't perform tail call elimination. Even though it's not guaranteed by the standard, isn't it performed in practice by any decent implementation anyhow? Assuming you're only targeting mature, well implemented compilers and don't care about absolute maximum portability to primitive compilers written for obscure platforms, is it reasonable to rely on tail call elimination in C?
Also, what was the rationale for leaving tail call optimization out of the standard?
Statements like "C doesn't perform tail call elimination" make no sense. As you correctly noted yourself, things like this depend entirely on the implementation. And yes, any decent implementation can easily turn tail-recursion into [an equivalent of] a cycle. Of course, C compilers do not normally give any guarantees about what optimizations will and what optimizations will not happen in each particular piece of code. You have to compile it and see for yourself.
Although modern compilers MAY do tail-call optimization if you turn on optimizations, your debug builds will probably run without it so that you can get stack traces and step in/out of code and wonderful things like that. In this situation, tail call optimization is not desired.
Since tail call optimization is not always desirable, it doesn't make sense to mandate it to compiler writers.
I think that tail call optimizations need to be guaranteed only where a lot of recursion is anticipated or required; that is, in languages that encourage or enforce a functional programming style. (With these kinds of languages, you may find that for
or while
loops are either strongly discouraged, perceived as inelegant, or probably even completely absent from the language, so you would resort to recursion for all these reasons, and probably more.)
The C programming language (IMHO) clearly was not designed with functional programming in mind. There's all kinds of loop constructs that are generally used in favour of recursion: for
, do .. while
, while
. In such a language, it wouldn't make much sense to prescribe tail call optimization in a standard, because it's not strictly required to guarantee working programs.
Contrast this again with a functional programming language that doesn't have while
loops: This means you will need recursion; which in turn means that the language must make sure that, with many iterations, stack overflows won't become a problem; thus the official standard for such a language might choose to prescribe tail call optimization.
P.S.: Note a slight flaw in my argument for tail call optimization. Towards the end of, I mention stack overflows. But who says that function calls always require a stack? On some platforms, function calls might be implemented in a totally different way, and stack overflows would never even be a problem. This would be yet another argument against prescribing tail call optimization in a standard. (But don't get me wrong, I can see the merits of such optimizations, even without a stack!)
To answer you last question: The standard should definitely not make any statements about optimization. There may be environments where it is more or less difficult to implement.
For those who like proof by construction, here is godbolt doing a nice tail call optimisation and inline: https://godbolt.org/z/DMleUN
However, if you crank the optimization to -O3 (or no doubt if you wait a few years or use a different compiler), the optimisation totally removes the loop/recursion.
Here is an example that optimizes down to a single instruction even with -O2: https://godbolt.org/z/CNzWex
The language standard defines how the language behaves, not how compilers are required to be implemented. Optimization isn't mandated because it isn't always wanted. Compilers provide options so that the user can enable optimizations if they so desire them, and can likewise turn them off. Compiler optimization can affect the ability to debug code (it becomes harder to match C to assembly in a line-by-line fashion), so it makes sense to only perform optimization at the user's request.
There are situations, where tail call optimisation would potentially break the ABI or at least be very difficult to implement in a semantic-preserving way. Think of position independent code in shared libraries for instance: Some platforms allow programs to link dynamically against libraries in order to save main memory when various different applications all depend on the same functionality. In such cases, the library is loaded once and mapped into each of the program’s virtual memory as if it was the only application on a system. On UNIX and also on some other systems, this is achieved by using position independent code for libraries, so that addressing is relative to an offset, rather than absolute to a fixed address space. On many platforms, however, position independent code must not be tail call optimised. The problem involved is that the offsets for navigating through the program have to be kept in registers; on Intel 32-bit, %ebx
is used which is a callee saved register; other platforms follow that notion. Unlike functions using normal calls, those deploying tail calls have to restore the callee saved registers before branching off to the subroutine, not when they return themselves. Normally, that is no problem, because at this point, the top most calling function does not care for the value stored in %ebx
, but the position independent code depends on this value upon each and every jump, call or branch command.
Other problems could be pending clean-ups in object-oriented languages (C++), meaning that the last call in a function isn't actually the last call - the clean-ups are. Hence, the compiler usually does not optimise, when this is the case.
Also setjmp
and longjmp
are problematic, of course, since this effectively means a function can finish execution more than once, before it actually finishes. Difficult or impossible to optimise at compile time!
There's probably more technical reasons one can think of. These are just some considerations.
It is common for compilers to recognize situations where a function won't need to do anything after calling another, and replace that call with a jump. Many cases where that can be done safely are easy to recognize, and such cases qualify as "safe low-hanging fruit". Even on compilers that can perform such optimization, however, it may not always be obvious when it should or will be performed. Various factors may make the cost of a tail call greater than that of a normal call, and these factors may not always be predictable. For example, if a function ends with return foo(1,2,3,a,b,c,4,5,6);
, it may be practical to copy a, b, and c into registers, clean up the stack and then prepare the arguments for passing, but there may not be enough registers available to handle foo(a,b,c,d,e,f,g,h,i);
likewise.
If a language had a special "tail call" syntax that required that compilers given that make a tail call if at all possible, and refuse compilation otherwise, code could safely assume such functions could be nested arbitrarily deep. When using ordinary call syntax, however, there's no general way to know whether a compiler would be able to perform a tail call more cheaply than an "ordinary" one.
精彩评论