开发者

C: Not explicitly returning constructed struct, but it still works

I write a 'constructor' function that makes a Node in C, compiled with Visual Studio 2008, ANSI C mode.

#include <stdio.h>
#inc开发者_JAVA百科lude <stdlib.h>

typedef struct _node
{
  struct _node* next ;
  char* data ;
} Node ;

Node * makeNode()
{
  Node * newNode = (Node*)malloc( sizeof(Node) ) ;

  // uncommenting this causes the program to fail.
  //puts( "I DIDN'T RETURN ANYTHING!!" ) ;
}

int main()
{
  Node * myNode = makeNode() ;
  myNode->data = "Hello there" ;

  // elaborate program, still works

  puts( myNode->data ) ;

  return 0 ;
}

What's surprising to me :

  • * Not returning a value from makeNode() is only a warning,
  • * More surprising is makeNode() __still works__ as long as I don't puts() anything!

What's going on here and is it "ok" to do this (not return the object you create in a C 'constructor' function?)

WHY is it still working? Why does the puts() command cause the program to fail?


The reason that not returning anything is a warning and not an error is probably largely historical. In 'traditional' C, functions didn't need to declare their return type which just defaulted to int. Some functions were written with no explicit return type and didn't return anything, others chose to return something meaningful but still didn't declare a return type. Trying to tighten up return statements or lack thereof would have meant breaking a lot of old-style code.

It might happen to work, but what you are seeing is dependent on things which aren't guaranteed.

What's probably happening is that the return value of a function goes into a particular register. After you call malloc, if you do nothing else and fall of the end of the function, what is returned by malloc appears to be returned by your function as the result is still sitting in the return register after that function call.

If you call some other function, the return value of malloc is lost and what is returned by your function is whatever happened to end up in the return register.


In your case, and almost always on x86, the return value is passed via the register EAX.

When you comment out puts(), no code intervenes between the assignment to newNode and the return of your function. This results in pseudo-assembly code such as:

push sizeof(Node)
call malloc              ; allocate memory, places return value in eax
move [newNode], eax      ; make use of return value from malloc
ret                      ; return from your function -- 
                         ; eax still contains malloc() return value

Therefore eax register is not changed and your function appears to return the pointer.

Compare with:

push sizeof(Node)
call malloc              ; allocate memory, places return value in eax
move [newNode], eax      ; make use of return value from malloc
push DWORD PTR ["I DIDN'T RETURN ANYTHING!!"]
call puts                ; during this subroutine, eax will be changed
ret                      ; return from your function -- 
                         ; eax contains garbage value left over from puts()


It probably has to do with the last thing on the stack. Without puts(), the last thing on the stack is the node you allocated, and it gets returned. With puts(), the last thing on the stack is the return value of puts(), which is an int, and is returned and used as a pointer, which is probably bad.

Note that, either way, your program is wrong. This is undefined behavior (or some siliarly scary sounding standardese) and shouldn't be relied upon. Make your function work right all the time - don't rely on undefined behavior.

If you want to find out if this is true, you can do this:

Node * makeNode()
{
  Node * newNode;

  puts( "I DIDN'T RETURN ANYTHING!!" ) ;

  newNode = (Node*)malloc( sizeof(Node) ) ;
}

Will probably work the same as the one that doesn't puts(). This will also probably work:

Node * makeNode()
{
  malloc( sizeof(Node) ) ;
}

But you really shouldn't be using any of these. They work by happenstance, and if you were really lucky none of them would work.

Also note that some people (including me) consider the typecasting of the return value of malloc() to be a bad idea, but that is a debatable subject.


My guess is that makeNode() stores the address of the allocated object in eax, which happens to be the register used to return pointers on x86.

And calling puts() probably happens to modify eax's value.

If you show the disassembly, we can tell you what really happens.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜