Is returning a forward-declared structure undefined behavior?
I have the following code (include-guards omitted for simplicity's sake):
foo.hpp
struct FOO
{
    int not_used_in_this_sample;
    int not_used_in_this_sample2;
};
main.cpp
#include "foo_generator.hpp"
#include "foo.hpp"

int main()
{
    FOO foo = FooGenerator::createFoo(0xDEADBEEF, 0x12345678);
    return 0;
}
foo_generator.hpp
#include <cstddef> // size_t

struct FOO; // FOO is only forward-declared

class FooGenerator
{
public:
    // Note: we return a FOO, not a FOO&
    static FOO createFoo(size_t a, size_t b);
};
foo_generator.cpp
#include "foo_generator.hpp"
#include "foo.hpp"
#include <iostream> // std::cout, std::hex

FOO FooGenerator::createFoo(size_t a, size_t b)
{
    std::cout << std::hex << a << ", " << b << std::endl;
    return FOO();
}
This code, as it stands, compiles perfectly fine without any warning. If my understanding is correct, it should output:
deadbeef, 12345678
But instead, it randomly displays:
12345678, 32fb23a1
Or just crashes.
If I replace the forward declaration of FOO in foo_generator.hpp with #include "foo.hpp", then it works.
So here is my question: does returning a forward-declared structure lead to undefined behavior? If not, what else could be going wrong?
Compiler used: MSVC 9.0 and 10.0 (both show the issue)
That should be fine according to 8.3.5/6 of the standard: "The type of a parameter or the return type for a function declaration that is not a definition may be an incomplete class type."
I think I ran into the same problem. It happens with small return types, and the order in which the headers are included matters. To avoid it, either don't forward-declare the return type, or include the headers in the same order everywhere.
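As a sketch of the first option, the generator header can pull in the full definition and refuse to compile if FOO is still incomplete at that point. Everything below is squeezed into one file purely for illustration (the comments mark where each of the question's files would start). The static_assert guard and the checkCreateFoo helper are my own additions, not part of the question's code, and the guard assumes a compiler that supports C++11 static_assert, which older compilers may lack:

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// ---- foo.hpp ----
struct FOO
{
    int not_used_in_this_sample;
    int not_used_in_this_sample2;
};

// ---- foo_generator.hpp (would start with #include "foo.hpp") ----
// sizeof is ill-formed on an incomplete type, so this guard (an addition,
// not part of the question's code) stops the build if only a forward
// declaration of FOO is visible at this point.
static_assert(sizeof(FOO) > 0, "FOO must be complete before createFoo is declared");

class FooGenerator
{
public:
    static FOO createFoo(std::size_t a, std::size_t b);
};

// ---- foo_generator.cpp ----
FOO FooGenerator::createFoo(std::size_t a, std::size_t b)
{
    std::cout << std::hex << a << ", " << b << std::endl;
    return FOO(); // value-initialized: both members are zero
}

// Hypothetical self-check: caller and callee see the same complete FOO.
inline void checkCreateFoo()
{
    FOO foo = FooGenerator::createFoo(0xDEADBEEF, 0x12345678);
    assert(foo.not_used_in_this_sample == 0);
    assert(foo.not_used_in_this_sample2 == 0);
}
```

With this layout, every file that includes the generator header should see the same complete type at the point where createFoo is declared, so they all agree on how the value comes back.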
For a possible explanation look at this:
func.h
struct Foo;
Foo func();
func.cpp
#include "func.h"
#include "foo.h"
Foo func()
{
    return Foo();
}
foo.h
struct Foo
{
    int a;
};
Notice that the whole of Foo fits in a single CPU register.
func.asm (MSVS 2005)
$T2549 = -4 ; size = 4
___$ReturnUdt$ = 8 ; size = 4
?func@@YA?AUFoo@@XZ PROC ; func
; 5 : return Foo();
xor eax, eax
mov DWORD PTR $T2549[ebp], eax
mov ecx, DWORD PTR ___$ReturnUdt$[ebp]
mov edx, DWORD PTR $T2549[ebp]
mov DWORD PTR [ecx], edx
mov eax, DWORD PTR ___$ReturnUdt$[ebp]
When func() is declared, Foo's size is unknown, so the compiler doesn't know how a Foo will be returned. func() therefore expects a pointer to the return value's storage as a hidden parameter. Here it's ___$ReturnUdt$. The value of Foo() is copied there.
If we swap the header order in func.cpp, we get:
func.asm
$T2548 = -4 ; size = 4
?func@@YA?AUFoo@@XZ PROC ; func
; 5 : return Foo();
xor eax, eax
mov DWORD PTR $T2548[ebp], eax
mov eax, DWORD PTR $T2548[ebp]
Now the compiler knows that Foo is small enough, so it is returned in a register and no extra parameter is needed.
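To make the difference between the two listings concrete, here is a rough model in plain C++ of what each one does. It is only an illustration, not compiler output: all function names are made up, and the explicit returnSlot parameter stands in for the hidden ___$ReturnUdt$ argument that the compiler adds on its own.

```cpp
#include <cassert>

struct Foo
{
    int a;
};

// Model of the first listing (Foo incomplete at the declaration): the
// caller supplies the address of the result, and the callee writes there.
Foo* funcViaHiddenPointer(Foo* returnSlot)
{
    Foo temp = Foo();    // zeroed temporary
    *returnSlot = temp;  // copy into the caller's storage
    return returnSlot;   // the listing also leaves this address in eax
}

// Model of the second listing (Foo complete and small): the value itself
// comes back directly, with no extra argument.
Foo funcViaRegister()
{
    return Foo();
}

// A caller that matches the hidden-pointer callee.
int callWithHiddenPointer()
{
    Foo slot;
    slot.a = -1;                  // sentinel, overwritten by the callee
    funcViaHiddenPointer(&slot);
    return slot.a;
}

// A caller that matches the register callee.
int callWithRegister()
{
    Foo result = funcViaRegister();
    return result.a;
}

// Hypothetical self-check: each matched pair yields the zeroed Foo.
inline void checkBothConventions()
{
    assert(callWithHiddenPointer() == 0);
    assert(callWithRegister() == 0);
}
```

Each caller works with its matching callee. The problem in this thread corresponds to pairing the register-style caller with the pointer-style callee, a mismatch that only exists at the machine-code level and that this model deliberately does not reproduce.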
main.cpp
#include "foo.h"
#include "func.h"
int main()
{
    func();
    return 0;
}
Notice that here Foo's size is known when func() is declared.
main.asm
; 5 : func();
call ?func@@YA?AUFoo@@XZ ; func
mov DWORD PTR $T2548[ebp], eax
; 6 : return 0;
So the compiler assumes func() will return its value in a register, and it doesn't pass a pointer to a temporary location for the return value. But if func() was compiled to expect that pointer, it writes to memory through a pointer that was never passed, corrupting the stack.
Let's change the header order so that func.h goes first.
main.asm
; 5 : func();
lea eax, DWORD PTR $T2548[ebp]
push eax
call ?func@@YA?AUFoo@@XZ ; func
add esp, 4
; 6 : return 0;
The compiler passes the pointer that func() expects, so no stack corruption results.
If Foo were bigger than 2 integers, the compiler would always pass the pointer.
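That rule of thumb can be written down as a compile-time check. This is only a sketch of the rule as stated in this answer: fitsInTwoInts and the three structs are made-up names, and a real calling convention may look at more than just the size, so don't treat it as the compiler's actual logic.

```cpp
#include <cassert>

struct OneInt    { int a; };
struct TwoInts   { int a; int b; };
struct ThreeInts { int a; int b; int c; };

// Hypothetical helper: true when T is no bigger than two ints, which is
// the size limit this answer gives for returning a value in registers.
template <typename T>
constexpr bool fitsInTwoInts()
{
    return sizeof(T) <= 2 * sizeof(int);
}

// Hypothetical self-check of the rule on the three sample structs.
inline void checkSizeRule()
{
    assert(fitsInTwoInts<OneInt>());
    assert(fitsInTwoInts<TwoInts>());
    assert(!fitsInTwoInts<ThreeInts>());
}
```

Under that rule, only types like OneInt and TwoInts can hit the mismatch described above; for something like ThreeInts, both sides would use the hidden pointer regardless of header order.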
It works fine for me under GCC. I don't know why it wouldn't, since foo.hpp is included before foo_generator.hpp.