Typedefs, (binary) Code duplication and Object File
Suppose I compile a source file that contains this piece of code:
struct Point
{
    int x;
    int y;
};
struct Size
{
    int x;
    int y;
};
Since Point and Size are exactly the same (in terms of the memory layout of their members), would the compiler generate duplicate code (one copy for each struct) in the object file? That is my first question.
Now, let's remove struct Size from the source code and define it using typedef instead, like this:
typedef Point Size;
What would the compiler do now? Would it duplicate code (since typedef isn't just renaming; it's more than that)?
Now suppose we have a class template like this:
template <int UnUsed>
class ConcreteError : public BaseError {
public:
    ConcreteError () : BaseError(), error_msg() {}
    ConcreteError (int errorCode, int osErrorCode, const std::string& errorMessage)
        : BaseError(errorCode, osErrorCode, errorMessage) {}
};
And then we set up a few definitions, like this:
typedef ConcreteError<0> FileError;
typedef ConcreteError<1> NetworkError;
typedef ConcreteError<2> DatabaseError;
Since the template parameter int UnUsed is not used in the implementation of the class (just suppose that), this situation seems exactly the same as multiple classes having exactly the same memory layout (similar to the case of struct Point and struct Size). Would there be duplicate code in the object file?
And what if we do it like this:
typedef ConcreteError<0> FileError;
typedef ConcreteError<0> NetworkError;
typedef ConcreteError<0> DatabaseError;
Is this situation better, since now we're using the same instantiated class in all the typedefs?
PS: this class template code is taken from here: How to create derived classes from a base class using template programming in C++?
Actually, I don't have any idea how the compiler generates an object file from source code, or how it handles class names, their members, other symbols and so on. How does it handle typedefs? What does it do with this:
typedef int ArrayInt[100];
Is ArrayInt a new type here? What code does the compiler create for it in the object file? Where is 100 stored?
No single line from your examples will generate any code in the object file. Or, more precisely, it won't generate any data at all, since "code" strictly means just processor instructions.
The data in an object file is divided into three segments: code, static data and constant data.
The code is generated by actual function definitions (with a function body, not just declarations), except for inline functions. Inline functions generate code at each place where they are actually used, provided the compiler really does inline them. Function templates generate code when they are instantiated, but identical instantiations that appear in several translation units are usually merged into a single copy by the compiler, the linker, or both.
The static data is generated by defining global variables, static member variables (again, actual definitions and not just declarations inside a class) and static local variables. To go into the static data segment, a variable must not be declared with the const modifier.
The constant data is generated by the same kinds of variable definitions as the static data, but with the const modifier, plus floating-point literals, string literals, and possibly other literals depending on the hardware platform. An OS may actually disallow write access to constant data at the hardware level, so your program may crash with an access violation or segmentation fault if you try to write something there.
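As a rough sketch of the three categories described above, the following translation unit marks which lines would typically end up where. All names here are hypothetical, and the exact section placement is an assumption that varies by compiler and platform:

```cpp
// Function definition: the body becomes machine code in the code segment.
int add(int a, int b) { return a + b; }

// Non-const global variable: writable static data.
int call_count = 0;

// Const object (extern gives it external linkage so it is not simply
// folded away): typically placed in read-only constant data.
extern const int max_items = 100;

// The function body is code; the string literal typically lives in
// read-only constant data.
const char* app_name() { return "demo"; }

// Declarations and type definitions only: nothing is emitted for these.
int not_defined_here(int);
extern int defined_elsewhere;
struct Point { int x; int y; };
```

With a typical toolchain you could compile this to an object file and inspect its symbol table to see that only add, call_count, max_items and app_name appear; Point and the two bare declarations leave no trace.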
I'm not really an expert on such low-level things so I might have missed something, but I think I described the overall picture pretty well.
Unless I have really missed something, nothing else in a program generates any data in the object file, especially declarations and type definitions. These are used internally by the compiler. So when the compiler sees a struct definition, it remembers that it consists of two integers (32 bits each on most current platforms). When it finds some real code that uses that struct, it knows that it must generate code that works with two 32-bit integers, must allocate at least 8 bytes to store it, and so on. But all this information is used internally at compile time and doesn't really go into the object file. If C++ had something like reflection, it would be another story.
Note that while defining a lot of structs adds nothing to your object file, it may increase memory usage by the compiler itself. So you may say that defining identical things leads to data duplication at compile time but not at run time.
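To illustrate that the layout information is something the compiler knows rather than something it emits, here is a minimal sketch using the two structs from the question. The area function is a hypothetical addition, there only to show the point at which code first appears:

```cpp
#include <cstddef>  // offsetof

struct Point { int x; int y; };
struct Size  { int x; int y; };

// These checks are evaluated entirely at compile time and produce
// no code or data in the object file.
static_assert(sizeof(Point) == sizeof(Size), "both structs have the same size");
static_assert(offsetof(Point, y) == offsetof(Size, y), "members sit at the same offsets");

// Only a real function definition that uses the struct produces code.
// The compiler applies its knowledge of the layout of Size here.
int area(const Size& s) { return s.x * s.y; }
```

If area were removed, this file would contribute no code and no data to the object file, even though two struct types are fully defined in it.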
Firstly, no code is generated for the first structure definitions you included, so it's a bit pointless to compare the two types. But in C++, type names are important, so struct A is definitely treated as distinct from struct B.
typedef creates type aliases, so the typedef-ed type is indeed the original type (it doesn't create a different type).
ConcreteError<0> is a different type than ConcreteError<1>.
I don't think anything prevents the compiler from aliasing mangled function names to the same code when the parameters are identical in terms of data layout, the functions don't call other functions on the data that differ by actual type, and the functions do equivalent things to both types. I didn't think this was really done in practice, but there are actually compilers which do this (see Ben's comment below).
For the last typedef (all are aliases to ConcreteError<0>) only one "version" of ConcreteError is created (because only that one is instantiated).
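The type-identity claims above can be checked at compile time with std::is_same. This is a sketch under assumptions: BaseError is a hypothetical stand-in because the real class is not shown in the question, the error_msg() initializer is left out since no such member is declared in the snippet, and Extent and AlsoFileError are made-up alias names:

```cpp
#include <string>
#include <type_traits>

struct Point { int x; int y; };
struct Size  { int x; int y; };
typedef Point Extent;  // hypothetical alias name, for illustration only

// Hypothetical stand-in for BaseError.
class BaseError {
public:
    BaseError() : errorCode_(0), osErrorCode_(0) {}
    BaseError(int errorCode, int osErrorCode, const std::string& errorMessage)
        : errorCode_(errorCode), osErrorCode_(osErrorCode), errorMessage_(errorMessage) {}
private:
    int errorCode_;
    int osErrorCode_;
    std::string errorMessage_;
};

template <int UnUsed>
class ConcreteError : public BaseError {
public:
    ConcreteError() : BaseError() {}
    ConcreteError(int errorCode, int osErrorCode, const std::string& errorMessage)
        : BaseError(errorCode, osErrorCode, errorMessage) {}
};

typedef ConcreteError<0> FileError;
typedef ConcreteError<1> NetworkError;
typedef ConcreteError<0> AlsoFileError;  // hypothetical second alias of the same instantiation

// Same layout, different names: two distinct types.
static_assert(!std::is_same<Point, Size>::value, "Point and Size are distinct types");
// A typedef is only another name for the same type.
static_assert(std::is_same<Point, Extent>::value, "Extent is just Point");
// Different template arguments give different types, even if the argument is unused.
static_assert(!std::is_same<FileError, NetworkError>::value, "distinct instantiations");
// Two typedefs of the same instantiation name one and the same type.
static_assert(std::is_same<FileError, AlsoFileError>::value, "same instantiation");
```

One practical consequence of the distinct instantiations: a handler written as catch (const FileError&) will not catch a NetworkError, which is presumably why the unused template parameter is there in the first place. With three typedefs of ConcreteError<0>, that distinction is lost.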
No, no duplicate code with unused PODs. If you use them, there will be two ints and possibly some padding allocated for them in memory. They will of course all look the same, so what you want to call that is debatable, but it's no more "duplication" than using the same type in two places.
No, no duplicate code with aliases. No code at all, actually.
Maybe; it depends on whether the compiler uses certain optimizations.
Maybe; it depends on whether your typedefs are used in different translation units and how good your toolchain is at removing duplicate instantiations.
No, it's an alias for int[100].
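A short sketch of the ArrayInt case: the 100 is part of the type and lives only in the compiler's bookkeeping, and storage is reserved only when an object of that type is defined. The variable name table is hypothetical:

```cpp
#include <type_traits>

typedef int ArrayInt[100];

// ArrayInt is just another name for int[100], not a new type.
static_assert(std::is_same<ArrayInt, int[100]>::value, "ArrayInt is an alias for int[100]");

// The bound 100 is a compile-time property of the type.
static_assert(std::extent<ArrayInt>::value == 100, "the bound is part of the type");
static_assert(sizeof(ArrayInt) == 100 * sizeof(int), "size follows from the bound");

// Only this object definition reserves storage in the object file.
ArrayInt table;
```

The typedef line on its own emits nothing; the number 100 shows up in the object file only indirectly, as the size of table or as constants in code that loops over such an array.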
To a great degree, the answer to the question "How much machine code results from this construct?" depends on the implementation.