C++ class ordering

I'm starting to play around with C++, coming from C and Objective-C (and a bit of Java). I thought a good way to start building my skills would be to write a simple hash table from scratch, using linked lists for collisions. So I started out by writing the skeletons for each class.

class HashTable
{
   public:
     ...
   private:
     ...
};

class LinkedList
{
   public:
     ...
   private:
     Node *root;
};

class Node
{
  public:
    Node *next;
    string key;
    int value;
    Node()
    {
      ...
    }
};

The weird thing about this, and it may not come as any surprise to C++ users, is that this code wouldn't work. I would get an error like:

error: expected type-specifier before ‘Node’

with respect to the root node in the LinkedList class.

When I simply reordered the classes so that it was Node{...}; LinkedList{...}; HashTable{...}; everything worked like a well oiled ice cream truck.

Now, I'm not one to question the design of C++, but is there any reason for this limitation? If I remember correctly, Objective-C's classes are essentially turned into tables and looked up on the fly. So what's the reason for this behavior?


The requirement for declarations of this sort comes from two forces. The first is that it simplifies compiler design. Since types and variables have the same identifier structure, the compiler must know which it is encountering whenever it does parse an identifier. There are two ways to do this. One way would be to require that every identifier be declared before it may be used in other definitions. This means that the code must forward declare any name it intends to use before giving its definition. This is a very easy way to write a compiler with an otherwise ambiguous grammar.

The other way to do this is to handle it in multiple passes. Any time an undeclared identifier is encountered, it is skipped, and the compiler tries to resolve it once it's parsed the whole file. It turns out that the grammar of C++ makes this very difficult to do correctly. Compiler writers didn't want to have to go to this trouble, and so we have forward declarations.
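
As a rough illustration of the ambiguity described above, consider the statement a * b; in the sketch below. Whether it is a multiplication or a pointer declaration depends entirely on whether a has already been declared as a variable or as a type. The function names and the local struct are made up for this example.

```cpp
// Case 1: `a` names a variable, so `a * b` is a multiplication.
int multiply_case() {
    int a = 6;
    int b = 7;
    return a * b;            // expression: multiplies the two ints
}

// Case 2: `a` names a type, so `a * b;` declares `b` as a pointer to `a`.
int declaration_case() {
    struct a { int value; }; // hypothetical type that happens to be named `a`
    a obj{42};
    a * b;                   // declaration: `b` is a pointer to `a`
    b = &obj;
    return b->value;
}
```

The token sequence is the same in both functions; only the earlier declaration of a tells the parser which reading applies.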

The other reason is that you may actually want forward declarations so that recursive structures are determinate as an intrinsic property of the language. This is a bit more subtle. Suppose you had written a mutually recursive class network:

class Bar; // forward declaration
class Foo {
    Bar myBar;
};

class Bar {
    int occupySpace;
    Foo myFoo;
};

This is obviously impossible, because the occupySpace member would appear in an infinitely nested recursion. Requiring that every type used in a definition be declared first, with a forward declaration as the minimum, gives the compiler a specific amount of information. In particular, a forward declaration gives the compiler enough information to form a pointer or reference to a class, but not to instantiate the class (because its size is not known). The forward declarations make this a feature of the syntax of the language, much like how lvalues are assignable as a feature of the language syntax rather than a more subtle semantic or run-time requirement.
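
For contrast, here is a minimal sketch of the same Foo/Bar pair written so that it does compile: each class holds a pointer to the other instead of a whole object. The link helper is a hypothetical function added only for illustration.

```cpp
class Bar;                 // forward declaration: Bar is an incomplete type here

class Foo {
public:
    Bar* myBar = nullptr;  // OK: a pointer's size does not depend on Bar's definition
    // Bar byValue;        // would not compile here: Bar's size is not yet known
};

class Bar {
public:
    int occupySpace = 0;
    Foo* myFoo = nullptr;  // a pointer keeps the mutual reference finite
};

// Once both definitions are complete, both sizes are known and finite.
static_assert(sizeof(Foo) >= sizeof(Bar*), "Foo stores at least one pointer");

// Hypothetical helper: make the two objects refer to each other.
void link(Foo& f, Bar& b) {
    f.myBar = &b;
    b.myFoo = &f;
}
```

With pointers, the recursion exists only at run time, through the addresses stored in the objects, so the compiler can still compute a fixed size for each class.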


The compiler throws the following error

error: expected type-specifier before ‘Node’

because, when it reaches

Node *root;

it does not yet know what Node is (Node is defined later in the file).

Two possible solutions:

  • Put the definition of Node class before LinkedList class (you already know this)

  • Forward declare the class Node before class LinkedList by putting this line

    class Node;

    This tells the compiler that there exists a class Node.

After reading PigBen's comment, it seems you are questioning the rationale for this behavior. I am not a compiler person, but I think that this behavior makes parsing easier. To me, it is similar to requiring a function declaration to be available before the function is used.

PS: Nitpick, for LinkedList, a variable name head may be more suitable than root.


The reason for this behavior is historical. The file is processed sequentially. At the time it comes across the first reference to an identifier, that identifier needs to have already been declared.

The compiler does not process the whole file first.

Instead of re-ordering the class definitions, you can often get away with a forward declaration

class Node;

class List
{
    public:
    //...
    private:
    Node *root;
    //...
};

//...


There is no technical reason for this limitation - that is proven by the fact that compilers do what you evidently expect within the context of a single class. Still, removing the "limitation" would complicate compilers further, slow them down, increase their memory usage, and crucially - would not be backwards compatible (as matches in a more localised scope (namespace) would presumably be selected over other symbols seen earlier).
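
The single-class case mentioned above can be sketched as follows. Inside a class, a member function body may use members that are declared further down, because such bodies are treated as if they were defined after the end of the class. The Counter class is a made-up example, not something from the question.

```cpp
class Counter {
public:
    // The body uses `count` and `step()`, both declared later in the class.
    int next() {
        count += step();
        return count;
    }

private:
    int step() const { return 2; }  // declared after its first use
    int count = 0;                  // declared after its first use
};
```

Calling next() twice on a fresh Counter returns 2 and then 4, even though neither step nor count had been seen when the body of next was first read.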

IMHO, it also makes code harder to read and understand. Being able to read from top to bottom and comprehend the code as you go is very useful, and encourages more thoughtful and structured expression of your problem solution.


Think about it the other way around; if it's reasonable to accept a class declared anywhere in the file as OK, why not a class declared in another file that has yet to be encountered?

If you go that far, then you end up not being able to give an error until you try to link the program, which may be far away from where the problem actually occurs.


You've declared that your LinkedList class has a member of type Node*, but the compiler doesn't know what a Node is because it has yet to be declared.

Just declare Node before LinkedList


This style means that the parser can run through the code fewer times. If the compiler first had to identify every declared type and then run through the code again, it would spend extra time parsing on that second pass through the file.

Of course, as so many have pointed out, you can use a forward declaration.
