Are preprocessors obsolete in modern languages?
I'm making a simple compiler for a simple pet language I'm creating. Coming from a C background (though I'm writing it in Ruby), I wondered whether a preprocessor is necessary.
What do you think? Is a "dumb" preprocessor still necessary in modern languages? Would C#'s conditional compilation capabilities be considered a "preprocessor"? Does every modern language that doesn't include a preprocessor have the utilities necessary to properly replace it? (For instance, the C++ preprocessor is now mostly obsolete, though still depended upon, because of templates.)
C's preprocessing can do really neat things, but if you look at the things it's used for you realize it's often for just adding another level of abstraction.
- Preprocessing for different operations on different platforms? It's basically a layer of abstraction for platform independence.
- Preprocessing for easily adding complex code? Abstraction because the language isn't generic enough.
- Preprocessing for adding extensions into your code? Abstraction because your code / your language isn't flexible enough.
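To make the first bullet concrete, here is a minimal C sketch of the preprocessor used as a platform-abstraction layer. The function name `path_separator` is hypothetical, and the sketch assumes the compiler predefines `_WIN32` on Windows targets, which is the usual convention. The conditional is confined to one function so that callers never see it.

```c
/* Hypothetical helper: hides the platform check behind one function,
   so calling code contains no #if of its own. */
static char path_separator(void)
{
#if defined(_WIN32)
    return '\\';   /* Windows-style paths */
#else
    return '/';    /* POSIX-style paths */
#endif
}
```

A caller just writes `path_separator()` and stays platform-neutral. A language with a built-in notion of build targets could arguably offer the same thing without a separate preprocessing phase.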
So my answer is: you don't need a preprocessor if your language is high-level enough *. I wouldn't call preprocessing evil or useless, I just say that the more abstract the language gets, the less reason I can think for it needing preprocessing.
* What's high-level enough? That is, of course, entirely subjective.
EDIT: Of course, I'm only really referring to macros. Using preprocessors for interfacing with other code files or for defining constants is evil.
The preprocessor is a cheap method to provide incomplete metaprogramming facilities to a language in an ugly fashion.
Prefer true metaprogramming or Lisp-style macros instead.
A preprocessor is not necessary. For real metaprogramming, you should have something like MetaML or Template Haskell or hygienic macros à la Scheme. For quick and dirty stuff, if your users absolutely must have it, there's always m4.
However, a modern language should support the equivalent of C's #line directives. Such directives enable the compiler to locate errors in the original source, even when that source is embedded in the input to a parser generator or a lexer generator, or in a literate program. In other words,
- Design your language so as not to need a preprocessor.
- Don't bundle your language with a blessed preprocessor.
- But if others have their own reasons for using a preprocessor (parser generation is a popular one), provide support for accurate error messages.
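As a rough illustration of the generator side of that last point, the sketch below shows how a tool might prefix user code with a C-style `#line` directive before copying it into its output. The function name `emit_embedded_code` and its parameters are hypothetical, and the sketch assumes the file name contains no quotes or backslashes, since it does no escaping.

```c
#include <stdio.h>

/* Hypothetical generator helper: writes a #line directive followed by the
   user's embedded code into buf, and returns what snprintf returns.
   Assumption: src_file needs no escaping (no quotes or backslashes). */
static int emit_embedded_code(char *buf, size_t cap,
                              const char *src_file, int src_line,
                              const char *code)
{
    return snprintf(buf, cap, "#line %d \"%s\"\n%s\n",
                    src_line, src_file, code);
}
```

Calling it with `"grammar.y"`, `42` and `"x = 1;"` produces a `#line 42 "grammar.y"` line followed by the code, so a compiler that honours the directive attributes errors in `x = 1;` to the original file.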
I think that preprocessors are a crutch to keep a language with poor expressive power walking.
I have seen so much abuse of preprocessors that I hate them with a passion.
A preprocessor is a separate phase of compilation. While preprocessing can be useful in some cases, the headaches and bugs it can cause make it a problem.
In C, the preprocessor is used mostly for:
- Including data - While powerful, the most common use cases do not need such power. "import"/"using" statements (as in Java/C#) are much cleaner to use, and few people need the remaining cases.
- Defining constants - Why not just provide a "const" statement?
- Macros - While C-style macros are very powerful (they can include statements such as returns), they also harm readability. Generics/templates are cleaner and, while less powerful in a few ways, they are easier to understand.
- Conditional compilation - This is possibly the most legitimate use case for preprocessors, but once again it's painful for readability. Separating platform-specific code into platform-specific source files and using ordinary if statements ends up being better for readability.
So my answer is that, while powerful, the preprocessor harms readability and/or isn't the best way to deal with some problems. Newer languages tend to consider code maintenance very important, and for those reasons the preprocessor appears to be obsolete.
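The macro bullet can be illustrated with a small C sketch. It shows one well-known pitfall of function-like macros, repeated evaluation of the argument, next to an inline function that avoids it. All names here (`SQUARE_MACRO`, `square_fn`, `next_value`, `calls`) are made up for the example.

```c
/* Function-like macro: the argument text is pasted in twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: the argument is evaluated once, then passed by value. */
static inline int square_fn(int x) { return x * x; }

/* Counts its own calls so the difference becomes visible. */
static int calls = 0;
static int next_value(void) { return ++calls; }
```

`SQUARE_MACRO(next_value())` expands to `((next_value()) * (next_value()))` and therefore calls `next_value` twice, whereas `square_fn(next_value())` calls it once. That kind of surprise is what the readability complaint is about.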
It's your language so you can build whatever capabilities you want into the language itself, without a need for a preprocessor. I don't think a preprocessor should be necessary, and it adds a layer of complexity and obscurity on top of a language. Most modern languages don't have preprocessors, and in C++ you only use it when you have no other choice.
By the way, I believe D handles conditional compilation without a preprocessor.
It depends on exactly what other features you offer. For example, if I have a const int N, do you offer for me to take N variables? Have N member variables, take an argument to construct all of them? Create N functions? Perform N operations that don't necessarily work in loops (for example, pass N arguments)? N template arguments? Conditional compilation? Constants that aren't integral?
The C preprocessor is so absurdly powerful in the proper hands, you'd need to make a seriously powerful language not to warrant one.
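One idiom that may be behind this kind of claim is the so-called X-macro, where a single list drives several generated declarations. The answer does not name this technique, so treat it as an assumption about what "N of something" can look like in plain C. All identifiers below are hypothetical.

```c
/* Single source of truth: add a colour here and everything below updates. */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

#define AS_ENUM(name)   COLOR_##name,
#define AS_STRING(name) #name,

/* Expands to: COLOR_RED, COLOR_GREEN, COLOR_BLUE, COLOR_COUNT */
enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };

/* Expands to: "RED", "GREEN", "BLUE", */
static const char *const color_names[] = { COLOR_LIST(AS_STRING) };
```

The enum and the name table cannot drift apart, because both are generated from `COLOR_LIST`. A language without a preprocessor would need some other compile-time mechanism, such as reflection over enums or real macros, to offer the same guarantee.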
I would say that although you should avoid the pre-processor for most everything you normally do, it's still necessary.
For example, in C++, in order to write a unit-testing library like Catch, a pre-processor is absolutely necessary. They use it in two different ways: one for assertion expansion [1], and one for nesting sections in test cases [2].
But the pre-processor shouldn't be abused to do compile-time computations in C++ where constant expressions and template meta-programming can be used.
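To show why assertion macros tend to need a preprocessor, here is a minimal sketch in C. It is not Catch's implementation (Catch is a C++ library and does considerably more); it only demonstrates the two things a plain function cannot do: capture the source text of the expression, and capture the call site's file and line. The names `CHECK`, `report_failure`, `failures` and `last_failed_expr` are hypothetical.

```c
#include <stdio.h>

static int failures = 0;
static const char *last_failed_expr = NULL;

/* Records and prints a failed check; called only by the CHECK macro. */
static void report_failure(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s:%d: CHECK(%s) failed\n", file, line, expr);
    last_failed_expr = expr;
    failures++;
}

/* #expr turns the argument's source text into a string literal, and
   __FILE__/__LINE__ are expanded at the call site. */
#define CHECK(expr)                                    \
    do {                                               \
        if (!(expr))                                   \
            report_failure(#expr, __FILE__, __LINE__); \
    } while (0)
```

Writing `CHECK(1 + 1 == 3);` reports the failing expression as the text `1 + 1 == 3` together with the location of that line, which is the information a test report needs.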
Sorry, I don't have enough reputation to post more than two links, so I'm putting this here:
- [1] github.com/philsquared/Catch/blob/master/docs/assertions.md
- [2] github.com/philsquared/Catch/blob/master/docs/test-cases-and-sections.md
As others have pointed out, much of the functionality provided by the C preprocessor exists to compensate for limitations of the C language. For example, #include and inclusion guards exist due to the lack of an import statement, and macros largely exist due to the lack of inline functions and constant declarations.
However, the one feature of the C preprocessor that would still be beneficial in more modern languages is the #line directive, since this supports the use of semantically rich preprocessors/compilers. As an example, consider yacc, which is a domain-specific language (DSL) for writing a parser as a collection of BNF grammar rules. A central feature of yacc is that chunks of C code called actions can be embedded within BNF rules. When a BNF rule is used to parse a piece of an input file, an action embedded in that rule will be executed. The yacc compiler generates a C file that implements the BNF-based parser specified in the input file, and any actions that appeared in the input yacc file are copied to the generated C file, but each action is surrounded by #line directives. This use of #line directives provides two important benefits.
First, if there is a syntax error in an action, then the error message generated by the C compiler can specify that the error occurred in, say, <input-file-to-yacc>, line 42 rather than in <output-file-generated-by-yacc>.c, line 3967.
Second, the location information provided by #line directives is copied into generated object code files created by the C compiler. So if you are using a debugger to investigate a program crash, and the bug that caused the crash originated from an action embedded in a yacc input file, then the debugger will report the location of that buggy line of source code as being in <input-file-to-yacc>, line 42 rather than in <output-file-generated-by-yacc>.c, line 3967.
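The effect described above can be observed directly in C. The sketch below pretends to be generated parser output; the file names `grammar.y` and `parser_output.c` and the line numbers are invented for illustration and are not taken from any real yacc run.

```c
/* Pretend this file is parser output: the directive tells the compiler that
   the next line came from line 42 of grammar.y (a hypothetical input file). */
#line 42 "grammar.y"
static int action_line(void) { return __LINE__; }          /* reported as line 42 */
static const char *action_file(void) { return __FILE__; }  /* reported as grammar.y */

/* A generator would normally switch back to its own output file afterwards;
   the number and name here are made up. */
#line 100 "parser_output.c"
```

Any diagnostic or debug information for `action_line` now points at `grammar.y`, line 42, regardless of where the code physically sits in the generated file.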
The designers of C# and Perl wisely provided a #line directive. Unfortunately, the designers of many other languages (Java being one that springs to mind) neglected to provide a #line directive. Because of this, yacc-like parser generators for many languages are unable to communicate the source location of embedded actions to compilers (and, therefore, to debuggers).