
How does the assembler handle classes and objects, and how are they stored in RAM and the executable?

How does an assembler handle classes and objects when a program is compiled? And how are they stored in RAM and in the executable file?

First, memory is allocated according to the class's size, for example 20 bytes. All variables of the class are stored in these 20 bytes. But there are two problems:

  1. How do pointers to classes work?
  2. How does a called method know which object it belongs to?

Could you explain this to me? (If using example code, I prefer Objective-C, C++ and x86 Assembly)


The assembler has no clue what a class is; it only assembles machine code, with the occasional macro tossed in. For all intents and purposes a class is merely a struct with an optional vftable, and all of the handling and class 'special features' (virtual dispatch, polymorphism, inheritance, etc.) is done in the intermediate stage, when the IR code is created. Memory is allocated the same way as for a struct, variable, array or any other data 'blob' (statically or dynamically, taking alignment, constness and packing into account). The only additions are the support code for unwinding destructors of stack and static objects (again done at the IR level), for constructors, and for static initialization (though static initialization can happen for more than class objects). I suggest you read through the Dragon Book (the first eight chapters would cover it) to get a clearer picture of how a compiler and an assembler work, since these things are handled not by the assembler but by the compiler front and/or back ends, depending on how the compiler and its IL are structured.


(2) Member functions are rewritten by the compiler. Consider class A as follows.

class A {
    int i;
public:
    A () : i (0) { }

    void f (int a, char *b) { i = a; }
};

Then what the compiler makes of A::f looks something like this (pseudocode):

void A::f (A * const this, int a, char *b) { this->i = a; }

Consider now a call to A::f.

A a;
a.f (25, "");

The compiler generates code similar to this (pseudocode):

A a;
A::f (&a, 25, "");
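The rewrite above can be approximated in compilable form. This is only a sketch of the idea, not what any particular compiler emits: the names A_data, A_ctor, A_f and demo_explicit_this are hypothetical, and the string parameter is declared const char * so that the literal "" converts cleanly in modern C++.

```cpp
#include <cassert>

// Hypothetical C-style equivalent of class A: only the data members remain.
struct A_data {
    int i;
};

// Stand-in for the constructor A::A(): it receives the object's address.
void A_ctor(A_data* self) {
    self->i = 0;
}

// Stand-in for A::f: the hidden 'this' becomes an explicit first parameter.
void A_f(A_data* self, int a, const char* b) {
    (void)b;       // unused, just as in the original A::f
    self->i = a;   // corresponds to "this->i = a"
}

// Mirrors "A a; a.f (25, "");" and returns the value that ends up stored.
int demo_explicit_this() {
    A_data a;
    A_ctor(&a);
    A_f(&a, 25, "");
    assert(a.i == 25);
    return a.i;
}
```

Calling demo_explicit_this() from main exercises the whole sequence. On x86 the explicit pointer is passed like any other argument, or in a register such as ECX under the MSVC thiscall convention.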

In other words, the compiler works the hidden this pointer into every non-static member function, and each call receives a pointer to the instance it was called on. In particular, this means that on many implementations you can get away with invoking a non-static, non-virtual member function through a NULL pointer, even though the C++ standard makes this undefined behavior.

A *a = NULL;
a->f (25, "");

In practice, the crash typically occurs only when the function actually accesses a non-static member variable. The resulting crash report also illustrates the answer to question (1). In many cases you will not crash on address 0, but on an offset from it. That offset is the offset of the accessed variable within the memory layout the compiler chose for class A (in this case, many compilers will place i at offset 0x0, and class A will be indistinguishable in memory from struct A { int i; };).

Basically, pointers to classes are pointers to the equivalent C struct. Member functions are implemented as free functions taking a pointer to the instance as their first argument. Any and all access checks with regard to public, protected and private members are done up front by the compiler; the generated assembly has no clue about any of those concepts. In fact, early C++ implementations such as Cfront worked by translating C++ source into plain C.
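To make the "base address plus offset" idea concrete, here is a small sketch. Point3, read_field_at and demo_offsets are hypothetical names introduced only for illustration; the code reads members by hand from the object's address, which is roughly what the generated machine code does with a fixed displacement.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstring>   // std::memcpy

// Hypothetical class with three members, laid out like a plain C struct.
struct Point3 {
    int x;
    int y;
    int z;
};

// Reads an int located 'offset' bytes past the start of an object,
// i.e. it computes "object address + member offset" manually.
int read_field_at(const void* object, std::size_t offset) {
    int value;
    std::memcpy(&value,
                static_cast<const unsigned char*>(object) + offset,
                sizeof value);
    return value;
}

// Returns true if manual offset arithmetic agrees with normal member access.
bool demo_offsets() {
    Point3 p{10, 20, 30};
    // For a standard-layout class, the first member lives at the object's address.
    bool first_at_base =
        static_cast<const void*>(&p) == static_cast<const void*>(&p.x);
    bool y_ok = read_field_at(&p, offsetof(Point3, y)) == 20;
    bool z_ok = read_field_at(&p, offsetof(Point3, z)) == 30;
    return first_at_base && y_ok && z_ok;
}
```

If p were a null pointer, an access to z would touch address 0 plus offsetof(Point3, z), which is the kind of small non-zero address that tends to show up in the crash report.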

The memory layout (typically) changes a bit when you have virtual functions.
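A rough way to observe this, assuming a mainstream implementation that stores a hidden pointer to a table of virtual functions (the vftable mentioned above) inside each object. The C++ standard does not require this mechanism, and Plain, Base, Derived and the two helper functions are hypothetical names for illustration.

```cpp
// No virtual functions: the object holds just its data member.
struct Plain {
    int i;
    int get() const { return i; }
};

// At least one virtual function: typical compilers add a hidden
// pointer to the class's virtual function table to every object.
struct Base {
    int i;
    virtual int id() const { return 1; }
    virtual ~Base() {}
};

struct Derived : Base {
    int id() const override { return 2; }
};

// True on typical implementations, where the hidden pointer costs extra space.
bool virtual_adds_storage() {
    return sizeof(Base) > sizeof(Plain);
}

// The call goes through the table found via the object, so Derived::id runs
// even though the static type of the reference is Base.
int call_through_base() {
    Derived d;
    d.i = 0;
    const Base& b = d;
    return b.id();
}
```

On common 64-bit ABIs sizeof(Plain) is 4 and sizeof(Base) is 16 (an 8-byte pointer, the int, and padding), but the exact numbers are implementation-specific.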
