Explain "C fundamentally has a corrupt type system"
In the book Coders at Work (p355), Guy Steele says of C++:
I think the decision to be backwards-compatible with C is a fatal flaw. It’s just a set of difficulties that can’t be overcome. C fundamentally has a corrupt type system. It’s good enough to help you avoid some difficulties but it’s not airtight and you can’t count on it
What does he mean by describing the type system as "corrupt"?
Can you demonstrate with a simple example in C?
Edit:
The quote sounds polemic, but I'm not trying to be. I simply want to understand what he means.
Please give examples in C, not C++. I'm interested in the "fundamentally" part too :)
The obvious examples of non-type-safety in C come from the fact that a void * converts to any other object pointer type without an explicit cast.
struct X
{
int x;
};
struct Y
{
double y;
};
struct X xx;
xx.x = 1;
void * vv = &xx;
struct Y * yy = vv; /* no need to cast explicitly */
printf( "%f", yy->y );
Of course printf itself is not exactly typesafe.
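To see why printf cannot be type-safe, here is a minimal sketch of a variadic function. sum_ints is a hypothetical helper, not a library function; the point is that the callee has to trust the caller about the types behind the "...", much as printf trusts its format string.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` int arguments.
 * The compiler does not check what is passed after `count`;
 * va_arg simply trusts the type the callee names. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   /* assumes every argument really is an int */
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) is fine. A call such as sum_ints(3, 1.0, 2, 3) also compiles, but its behavior is undefined because a double is read back as an int.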
C++ is not totally typesafe.
struct Base
{
int b;
};
struct Derived : Base
{
int d;
Derived()
{
b = 1;
d = 3;
}
};
Derived derivs[50];
Base * bb = &derivs[0];
std::cout << bb[3].b << std::endl;
The compiler has no problem converting the Derived* to a Base*, but you run into trouble when you use the Base* as an array: the pointer arithmetic steps by sizeof(Base) rather than sizeof(Derived), so although every b value is 1 you may well get a 3 (the ints are laid out 1-3-1-3 and so on).
Basically you can cast any data type to any data type
struct SomeStruct {
void* data;
};
struct SomeStruct object;
*( (int*) &object ) = 10;
and no one catches you.
char buffer[42];
FunctionThatDestroysTheStack(buffer); // By writing 43 chars or more
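Part of the reason the callee cannot protect the buffer is that an array argument decays to a plain pointer, so the length 42 is not part of the type the function sees. A small sketch (both function names are made up for illustration):

```c
#include <stddef.h>

/* Hypothetical callee: all it receives is a char *, with no length attached. */
size_t size_seen_by_callee(char *buffer)
{
    return sizeof buffer;               /* size of a pointer, not of the array */
}

/* Hypothetical caller: here the array type, and so its size, is still known. */
size_t size_seen_by_caller(void)
{
    char buffer[42];
    (void)size_seen_by_callee(buffer);  /* buffer decays to char * at the call */
    return sizeof buffer;               /* the full array size */
}
```

This is why C functions that write into a buffer usually take the length as a separate parameter, and why nothing in the type system complains when they don't.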
The C type system does have some problems. Things like implicit function declaration and implicit conversion from void* can SILENTLY break type safety.
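A minimal sketch of the void* hole, written so that it never dereferences the mistyped pointer. The function name is hypothetical, and the sketch assumes double is at least as strictly aligned as int, which holds on mainstream platforms. A C compiler accepts both conversions without a cast; a C++ compiler rejects the second one.

```c
/* Hypothetical demo: returns 1 if the address survives the trip through void *. */
int void_pointer_launders_type(void)
{
    double d = 1.0;
    void *v = &d;   /* double * -> void *: implicit, always allowed */
    int *ip = v;    /* void * -> int *: also implicit in C; the type is now wrong */

    /* Reading *ip would break the aliasing rules, so only compare addresses. */
    return (void *)ip == (void *)&d;
}
```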
C++ fixes pretty much all of these holes. The C++ type system is NOT backwards compatible with C, it's only compatible with well-written typesafe C code.
Furthermore, the people arguing against C++ typically point you to Java or C# as the "solution". Yet Java and C# do have holes in their type system (array covariance). C++ doesn't have this problem.
EDIT: Examples, in C++, attempting to use array covariance that would (improperly) be allowed by the Java and C# type systems.
#include <stdlib.h>
struct Base {};
struct Derived : Base {};
template<size_t N>
void func1( Base (&array)[N] );
void func2( Base** pArray );
void func3( Base*& refArray );
void test1( void )
{
Base b[40];
Derived d[40];
func1(b); // ok
func1(d); // error caught by C++ type system
}
void test2( void )
{
Base* b[40] = {};
Derived* d[40] = {};
func2(b); // ok
func2(d); // error caught by C++ type system
func3(b[0]); // ok
func3(d[0]); // error caught by C++ type system
}
Results:
Comeau C/C++ 4.3.10.1 (Oct 6 2008 11:28:09) for ONLINE_EVALUATION_BETA2
Copyright 1988-2008 Comeau Computing. All rights reserved.
MODE:strict errors C++ C++0x_extensions
"ComeauTest.c", line 19: error: no instance of function template "func1" matches
the argument list
The argument types that you used are: (Derived [40])
func1(d); // error caught by C++ type system
^
"ComeauTest.c", line 28: error: argument of type "Derived **" is incompatible with
parameter of type "Base **"
func2(d); // error caught by C++ type system
^
"ComeauTest.c", line 31: error: a reference of type "Base *&" (not const-qualified)
cannot be initialized with a value of type "Derived *"
func3(d[0]); // error caught by C++ type system
^
3 errors detected in the compilation of "ComeauTest.c".
This doesn't mean that there are no holes at all in the C++ type system, but it does show that you can't silently overwrite a pointer-to-Derived with a pointer-to-Base like Java and C# allow.
You'd have to ask him what he meant to get a definitive answer, or perhaps provide more context for that quote.
However, it is pretty clear that if this is a fatal flaw for C++, the disease is chronic, not acute - C++ is thriving, and continually evolving as evidenced by ongoing Boost and C++0x efforts.
I don't even think about C and C++ as coupled any more - a few weeks on the respective fora here quickly cures one of any confusion over the fact that they are two different languages, each with its own strengths and foibles.
IMHO the "most broken" part of the C type system is that the concepts of
- values/parameters that are optional
- mutable values/pass-by-reference
- arrays
- non-POD function parameters
are all mapped to the single language concept "pointer". That means, if you get a function parameter of type X*, it might be an optional parameter, it might be expected that the function changes the value pointed to by X*, it might be that there are multiple instances of X after the one pointed to (it's open how many: the number could be passed as a separate parameter, or some kind of special "terminator" value might mark the end of the array, as in nul-terminated strings). Or the parameter might simply be a single structure that you're not expected to change, but that is cheaper to pass by reference.
If you get something of type X**, it might be an array of optional values, or it might be an array of simple values and you're expected to change it. Or it might be a 2d jagged array. Or an optional value passed by reference.
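The ambiguity can be sketched in C with four hypothetical functions. Each takes a pointer, each means something different by it, and only the comments (not the types) tell them apart:

```c
#include <stddef.h>

/* Optional parameter: NULL means "use the default". */
int scale_or_default(const int *factor)
{
    return factor != NULL ? *factor : 1;
}

/* Out parameter: the function writes through the pointer. */
void reset(int *value)
{
    *value = 0;
}

/* Array: the pointer is the first of `count` elements. */
int sum(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

/* Single struct, passed by pointer only to avoid copying it. */
struct Point { int x; int y; };

int manhattan_length(const struct Point *p)
{
    return (p->x < 0 ? -p->x : p->x) + (p->y < 0 ? -p->y : p->y);
}
```

Passing NULL to reset, or a pointer to a single int to sum with a count of 10, compiles just as happily as the intended uses.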
In contrast, take the ML family of languages (F#, OCaml, SML). Here these concepts map to separate language constructs:
- values that are optional have the type X option
- values that are mutable/pass by reference have the type X ref
- arrays have the type X array
- and non-POD types can be passed like PODs. Because they aren't mutable, the compiler can pass them by reference internally, but you don't need to know about that implementation detail
And you can of course combine those, e.g. int option ref is a mutable value that can be set to nothing or to some integer value. int ref option, on the other hand, is an optional mutable value; it can be nothing (and no one can change it) or it can be some mutable int (and you can change it to any other mutable int, but not to nothing).
These distinctions are very subtle, but you have to make them whether you program in ML or not. In C you have to make the same distinctions, but they're not explicitly stated in the type system. You have to read the documentation very carefully, or you might introduce subtle (read: hard to find) bugs if you misunderstand which kind of pointer usage is meant when.
Here, "corrupt" means that it is not "strict", leading to never-ending delight in C++ (because of the many custom types (objects) and overloaded operators, casting becomes a superior nuisance in C++).
The attack against C comes in regard to its MISPLACED USAGE as a strict OOP basis.
C was never designed to limit coders; hence, maybe, the frustration of Academia (and the flamboyant splendor of the ++ given to the World by B.S.).
"I invented the term Object-Oriented, and I can tell you I did not have C++ in mind"
(Alan Kay)