开发者

Pointer (trick): Memory Reference

A trick question about C pointer. Read code snippet below and try explain why the list value changed (this question was based on this code):

tail has the memory address of list.

How is possible list be changed below?

typedef struct _node {
   struct _node *next;
   int value;
}Node;


 int main(){
   Node *list, *node, *tail;
   int i = 100;

   list = NULL;
   printf("\nFirst . LIST value = %d", list);

  tail =(N开发者_运维知识库ode *) &list;
  node = malloc (sizeof (Node));
  node->next = NULL;
  node->value = i;

  //tail in this point contains the memory address of list
  tail->next = node;
  printf("\nFinally. LIST value = %d", list);
  printf("\nLIST->value = %d", (list->value));

return 0;

}

---- Output

First . List value = 0

why this values ??? im not expecting this ...

Finally . LIST value = 16909060

LIST->value = 100


Let's look at what happens to the memory in your program. You start with 3 local variables, all of type Node*. At the moment they all point to garbage, as they have been declared but not initialised.

An ascii art diagram of the memory might be (The layout is implementation dependant)

      list   node   tail
  --------------------------
... | 0xFE | 0x34 | 0xA3 | ...
  --------------------------

You then set list to NULL, and tail to the address of node (casting away its type, a bad idea), giving you

      list   node   tail
  --------------------------
... | NULL | 0xFE | &list | ...
  --------------------------
        ^             |
        +-------------+

You then malloc a new Node, setting list to its address.

      list    node   tail          next  value
  ---------------------------  ------------------
... | NULL | &next | &list | ... | NULL | 100 | ...
  ---------------------------  ------------------
        ^      |       |             ^
        |      +---------------------+
        +--------------+

You next try to set tail->next to node. You've said that you know tail points to a Node when you did the typecast, so the compiler believes you. The Node tail points to starts at list's address, like so

     tail                                list
     next    value                       next   value
  ----------------------------------  ------------------
... | NULL | &list->next | &list | ... | NULL | 100 | ...
  ----------------------------------  ------------------

You then set tail->next to node, making both list and node point to the list structure.

      list    node   tail          next  value
   ---------------------------  ------------------
... | &next | &next | &list | ... | NULL | 100 | ...
   ---------------------------  ------------------
       | ^     |       |             ^
       | |     +---------------------|
       | +-------------+             |
       +-----------------------------+

You've printed list as a signed integer ("%d"). This is a bad idea - if you are using a 64 bit machine and have other arguments in the printf statement they may be clobbered, use the pointer format ("%p") instead. list->value is the same as node->value, so it's still going to be 100.

Pointers become easier if you think about how they actually are represented in the machine - as an index to a huge array which holds all of your data (modulo pointer sizes, virtual memory etc.).

Next time it might be easier just to use list = node.


The following line is wrong:

tail =(Node *) &list; 

You take the address of the variable list, which is actually of type Node **. Then you cast it to a Node *. Although you can do this in C/C++, this is probably not want you intended.

To get the wanted behavior, tail should be of type Node **. So no casting is needed anymore, and at the end, you need to write (*tail)->next = node.


The reason tail has memory address of list is in this line

tail =(Node *) &list;

which means, assign the address of the pointer pointed to by list to the pointer variable tail.

And since tail and list both point to the same address, that is the basics of setting up the linked-list data structure.

Edit: Speaking of which, there is NO reference to Node as you have a struct _node declared... Amended this to take into account of the OP's code posting that left out Node....


The problem is in setting

tail = (Node*) &list

Thus list is a Node*, tail is a Node** , which is cast to Node*. Now here

tail->next == (*tail)+0 == (*&list)+0

thus

tail->next == list

Thus changing tail->next is the same as changing list.


The line

tail =(Node *) &list;

assigns the address of list to tail. Since &list is a Node **, the compiler doesn't like this assignment by default, so you add an explicit cast to silence it. Then

tail->next = node;

changes a member value in the struct supposedly pointed to by tail (at least the compiler believes it is a struct, since you explicitly told it so). Since next is the first member of the struct, its address is most likely the same as that of the struct itself. And since tail points to the address of list, in effect this assignment changes the value of list, which is a pointer to a _node. That is, it makes list point to node.

What you probably want is

Node** tail;
...
tail = &list;
...
(*tail)->next = node;

That is, declare tail as a pointer to pointer to _node, and add the extra indirection (*) when assigning a value through it.


By assigning the address of list to tail, you cause list and tail->next to refer to the same location in memory; when you assign to one, you clobber the other.

Let's start by looking at a hypothetical memory map of node after allocation and assignments (assuming 4 byte pointers and ints):

Object     Address      0x00  0x01  0x02  0x03
------     -------      ----  ----  ----  ----
node       0x08000004   0x10  0x00  0x00  0x00  // points to address 0x10000000
...
node.next  0x10000000   0x00  0x00  0x00  0x00  // points to NULL
node.value 0x10000004   0x00  0x00  0x00  0x64  // value = 100 decimal

When you write node->next = NULL, you're assigning NULL to memory location 0x10000000. IOW, the value of node corresponds to the address where node->next will be found.

Now let's look at a hypothetical layout of list, node, and tail after you've assigned list and tail

Object     Address      0x00  0x01  0x02  0x03
------     -------      ----  ----  ----  ----
list       0x08000000   0x00  0x00  0x00  0x00  // after list = NULL
node       0x08000004   0x10  0x00  0x00  0x00  // after node = malloc(sizeof *node);
tail       0x08000008   0x08  0x00  0x00  0x00  // after tail = (Node*) &list;

So now here's the memory map of tail after you've assigned tail->next:

Object     Address      0x00  0x01  0x02  0x03
------     -------      ----  ----  ----  ----
tail       0x08000008   0x08  0x00  0x00  0x00  // points to address 0x80000000,
...                                             // which is where list lives
tail.next  0x08000000   0x08  0x00  0x00  0x04  // points to node
tail.value 0x08000004   0x10  0x00  0x00  0x00  // value = some big number

Presto: list now contains the address of node.

Please for the love of God never do this in production code.


Isn't this line...

tail =(Node *) &list;

assigning to tail the address of the pointer to list, not the address of list?


If memory is changing unexpectedly the quickest way to track down the issue is to configure a breakpoint conditional on the memory change, including the size of the memory block of interest - 4 in this case, assuming it's a 32-bit platform pointer. In Windows (Visual Studio IDE or Windbg) this is easy to do - I have no info on other systems.

Usually you will find what's causing the issue very quickly this way.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜