Should one really set pointers to `NULL` after freeing them?
There seem to be two arguments why one should set a pointer to NULL
after freeing them.
Avoid crashing when double-freeing pointers.
Short: Calling free()
a second time, by accident, doesn't crash when it's set to NULL
.
Almost always this masks a logical bug because there is no reason to call
free()
a second time. It's safer to let the application crash and be able to fix it.It's not guaranteed to crash because sometimes new memory is allocated at th开发者_运维技巧e same address.
Double free occurs mostly when there are two pointers pointing to the same address.
Logical errors can lead to data corruption too.
Avoid reusing freed pointers
Short: Accessing freed pointers can cause data corruption if malloc()
allocates memory in the same spot unless the freed pointer is set to NULL
There's no guarantee that the program crashes when accessing the
NULL
pointer, if the offset is big enough (someStruct->lastMember
,theArray[someBigNumber]
). Instead of crashing there will be data corruption.Setting the pointer to
NULL
cannot solve the problem of having a different pointer with the same pointer value.
The questions
Here's a post against blindly setting a pointer to NULL
after freeing.
- Which one is harder to debug?
- Is there a possibility to catch both?
- How likely is it, that such bugs lead to data corruption instead of crashing?
Feel free to expand this question.
The second one is way more important: re-using a freed pointer can be a subtle error. Your code keeps right on working, and then crashes for no clear reason because some seemingly unrelated code wrote in the memory that the re-used pointer happens to be pointing at.
I once had to work on a really buggy program someone else wrote. My instincts told me that many of the bugs were related to sloppy attempts to keep using pointers after freeing the memory; I modified the code to set the pointers to NULL after freeing the memory, and bam, the null pointer exceptions started coming. After I fixed all the null pointer exceptions, suddenly the code was much more stable.
In my own code, I only call my own function that is a wrapper around free(). It takes a pointer-to-a-pointer, and nulls the pointer after freeing the memory. And before it calls free, it calls Assert(p != NULL);
so it still catches attempts to double-free the same pointer.
My code does other things too, such as (in the DEBUG build only) filling memory with an obvious value immediately after allocating it, doing the same right before calling free()
in case there is a copy of the pointer, etc. Details here.
EDIT: per a request, here is example code.
void
FreeAnything(void **pp)
{
void *p;
AssertWithMessage(pp != NULL, "need pointer-to-pointer, got null value");
if (!pp)
return;
p = *pp;
AssertWithMessage(p != NULL, "attempt to free a null pointer");
if (!p)
return;
free(p);
*pp = NULL;
}
// FOO is a typedef for a struct type
void
FreeInstanceOfFoo(FOO **pp)
{
FOO *p;
AssertWithMessage(pp != NULL, "need pointer-to-pointer, got null value");
if (!pp)
return;
p = *pp;
AssertWithMessage(p != NULL, "attempt to free a null FOO pointer");
if (!p)
return;
AssertWithMessage(p->signature == FOO_SIG, "bad signature... is this really a FOO instance?");
// free resources held by FOO instance
if (p->storage_buffer)
FreeAnything(&p->storage_buffer);
if (p->other_resource)
FreeAnything(&p->other_resource);
// free FOO instance itself
free(p);
*pp = NULL;
}
Comments:
You can see in the second function that I need to check the two resource pointers to see if they are not null, and then call FreeAnything()
. This is because of the assert()
that will complain about a null pointer. I have that assert in order to detect an attempt to double-free, but I don't think it has actually caught many bugs for me; if you want to leave out the asserts, then you can leave out the check and just always call FreeAnything()
. Other than the assert, nothing bad happens when you try to free a null pointer with FreeAnything()
because it checks the pointer and just returns if it was already null.
My actual function names are rather more terse, but I tried to pick self-documenting names for this example. Also, in my actual code, I have debug-only code that fills buffers with the value 0xDC
before calling free()
so that if I have an extra pointer to that same memory (one that doesn't get nulled out) it becomes really obvious that the data it's pointing to is bogus data. I have a macro, DEBUG_ONLY()
, which compiles to nothing on a non-debug build; and a macro FILL()
that does a sizeof()
on a struct. These two work equally well: sizeof(FOO)
or sizeof(*pfoo)
. So here is the FILL()
macro:
#define FILL(p, b) \
(memset((p), b, sizeof(*(p)))
Here's an example of using FILL()
to put the 0xDC
values in before calling:
if (p->storage_buffer)
{
DEBUG_ONLY(FILL(pfoo->storage_buffer, 0xDC);)
FreeAnything(&p->storage_buffer);
}
An example of using this:
PFOO pfoo = ConstructNewInstanceOfFoo(arg0, arg1, arg2);
DoSomethingWithFooInstance(pfoo);
FreeInstanceOfFoo(&pfoo);
assert(pfoo == NULL); // FreeInstanceOfFoo() nulled the pointer so this never fires
I don't do this. I don't particularly remember any bugs that would have been easier to deal with if I did. But it really depends on how you write your code. There are approximately three situations where I free anything:
- When the pointer holding it is about to go out of scope, or is part of an object which is about to go out of scope or be freed.
- When I am replacing the object with a new one (as with reallocation, for instance).
- When I am releasing an object which is optionally present.
In the third case, you set the pointer to NULL. That's not specifically because you're freeing it, it's because the whatever-it-is is optional, so of course NULL is a special value meaning "I haven't got one".
In the first two cases, setting the pointer to NULL seems to me to be busy work with no particular purpose:
int doSomework() {
char *working_space = malloc(400*1000);
// lots of work
free(working_space);
working_space = NULL; // wtf? In case someone has a reference to my stack?
return result;
}
int doSomework2() {
char * const working_space = malloc(400*1000);
// lots of work
free(working_space);
working_space = NULL; // doesn't even compile, bad luck
return result;
}
void freeTree(node_type *node) {
for (int i = 0; i < node->numchildren; ++i) {
freeTree(node->children[i]);
node->children[i] = NULL; // stop wasting my time with this rubbish
}
free(node->children);
node->children = NULL; // who even still has a pointer to node?
// Should we do node->numchildren = 0 too, to keep
// our non-existent struct in a consistent state?
// After all, numchildren could be big enough
// to make NULL[numchildren-1] dereferencable,
// in which case we won't get our vital crash.
// But if we do set numchildren = 0, then we won't
// catch people iterating over our children after we're freed,
// because they won't ever dereference children.
// Apparently we're doomed. Maybe we should just not use
// objects after they're freed? Seems extreme!
free(node);
}
int replace(type **thing, size_t size) {
type *newthing = copyAndExpand(*thing, size);
if (newthing == NULL) return -1;
free(*thing);
*thing = NULL; // seriously? Always NULL after freeing?
*thing = newthing;
return 0;
}
It's true that NULL-ing the pointer can make it more obvious if you have a bug where you try to dereference it after freeing. Dereferencing probably does no immediate harm if you don't NULL the pointer, but is wrong in the long run.
It's also true that NULL-ing the pointer obscures bugs where you double-free. The second free does no immediate harm if you do NULL the pointer, but is wrong in the long run (because it betrays the fact that your object lifecycles are broken). You can assert things are non-null when you free them, but that results in the following code to free a struct which holds an optional value:
if (thing->cached != NULL) {
assert(thing->cached != NULL);
free(thing->cached);
thing->cached = NULL;
}
free(thing);
What that code tells you, is that you've got in too far. It should be:
free(thing->cached);
free(thing);
I say, NULL the pointer if it's supposed to remain usable. If it isn't usable any more, best not to make it falsely appear to be, by putting in a potentially-meaningful value like NULL. If you want to provoke a page fault, use a platform-dependent value which isn't dereferancable, but which the rest of your code won't treat as a special "everything is fine and dandy" value:
free(thing->cached);
thing->cached = (void*)(0xFEFEFEFE);
If you can't find any such constant on your system, you may be able to allocate a non-readable and/or non-writeable page, and use the address of that.
If you don't set the pointer to NULL there is a not-so-small chance, that your application continues to run in an undefined state and crashes later on at a completely unrelated point. Then you will spend a lot of time with debugging a nonexistent error before you find out, that it's a memory-corruption from earlier.
I'd set the pointer to NULL because the chances are higher that you'll hit the correct spot of the error earlier than if you didn't set it to NULL. The logical error of freeing memory a second time is still to be thought of and the error that your application does NOT crash on null-pointer access with a large enough offset is in my opinion completely academic although not impossible.
Conclusion: I'd go for setting the pointer to NULL.
The answer depends on (1) project size, (2) expected lifetime of your code, (3) team size. On a small project with a short lifetime, you can skip setting pointers to NULL, and just debug along.
On a big, long-lived project, there are good reasons to set pointers to NULL: (1) Defensive programming is always good. Your code might be ok, but the beginner next door might still struggle with pointers (2) My personal belief is, that all variable should contain only valid values at all times. After a delete / free, the pointer is not a valid value anymore, so it needs to be removed from that variable. Replacing it by NULL (the only pointer value that's always valid) is a good step. (3) Code never dies. It always gets reused, and often in ways that you haven't imagined at the time you wrote it. Your code segment might end up being compiled in a C++ context, and probably get moved to a destructor or a method that gets called by a destructor. The interactions of virtual methods and objects that are in the process of being destructed are subtle traps even for very experienced programmers. (4) If your code ends up being used in a multi-threaded context, some other thread might read that variable and try to access it. Such contexts often arise when legacy code is wrapped and reused in a web server. So an even better way of freeing memory (from a paranoid standpoint) is to (1) copy the pointer to a local variable, (2) set the original variable to NULL, (3) delete/free the local variable.
If the pointer is going to be reused then is should be set back to 0(NULL) after use even if the object it was pointing is not freed from the heap. This allows for valid checking against NULL like if (p){ //do something}. Also just because you free an object whose address the pointer is pointing to doesn't mean the pointer is set 0 after calling on delete keyword or free function at all.
If the pointer is used once and it is part of a scope which makes it local, then there is not need to set it to NULL being that it will be disposed of from the stack after the function returns.
If the pointer is a member (struct or class), you should set it to NULL after freeing the object or objects on a double pointer again for valid checking against NULL.
Doing this will help you alleviate the headaches from invalid pointers like '0xcdcd...' and so on. So if the pointer is 0 then you know it is not pointing to an address and can make sure that the object is released from the heap.
Both are very important since they deal with undefined behaviour. You should not leave any ways to undefined behaviour in your program. Both can lead to crashes, corrupting data, subtle bugs, any other bad consequences.
Both are quite difficult to debug. Both can't be avoided for sure, especially in case of complex data structures. Anyway you're much better off if you follow the following rules:
- always initialize pointers - set them to NULL or some valid address
- after you call free() set the pointer to NULL
- check any pointers that can possibly be NULL for actually being NULL before dereferencing them.
In C++ could catch both by implementing your own smart pointer (or deriving from existing implementations) and implementing something like:
void release() {
assert(m_pt!=NULL);
T* pt = m_pt;
m_pt = NULL;
free(pt);
}
T* operator->() {
assert(m_pt!=NULL);
return m_pt;
}
Alternatively, in C you could at least provide two macros to the same effect:
#define SAFE_FREE(pt) \
assert(pt!=NULL); \
free(pt); \
pt = NULL;
#define SAFE_PTR(pt) assert(pt!=NULL); pt
These problems are most often only symptoms for a much deeper problem. This can occur for all resources that require aquisition and a later release, e.g. memory, files, databases, network connections, etc. The core problem is that you've lost track of the resource allocations by a missing code structure, throwing random mallocs and frees all over the code base.
Organize the code around DRY - Don't Repeat Yourself. Keep related things together. Do one thing only, and do it good. The "module" that allocates a resource is responsible for releasing it and has to provide a function for doing so that keeps care for the pointers, too. For any specific resource, you then have exactly one place where it is allocated and one place where it is released, both close together.
Say you want to split a string into substrings. Directly using malloc(), your function has to care for everything: Analysing the string, allocate the right amount of memory, copy the substrings there, and and and. Make the function complicated enough, and it is not the question if you will lose track of the resources, but when.
Your first module takes care for the actual memory allocation:
void *MemoryAlloc (size_t size)
void MemoryFree (void *ptr)
There's your only place in your whole codebase where malloc() and free() are called.
Then we need to allocate strings:
StringAlloc (char **str, size_t len)
StringFree (char **str)
They take care that len+1 is needed and that the pointer is set to NULL when freed. Provide another function to copy a substring:
StringCopyPart (char **dst, const char *src, size_t index, size_t len)
It will take care if index and len are inside the src string and modify it when needed. It will call StringAlloc for dst, and it will care that dst is correctly terminated.
Now you can write your split function. You don't have to care for the low level details anymore, just analyse the string and get the substrings out of it. Most of the logic is now in the module where it belongs, instead of mixed together in one large monstrosity.
Of course this solution has its own problems. It provides abstraction layers, and each layer, while solving other problems, comes with its own set of them.
There isn't really a "more important" part to which of the two problems you're trying to avoid. You really, really need to avoid both if you want to write reliable software. It's also very likely that either of the above will lead to data corruption, having your webserver pwned and other fun along those lines.
There is also another important step to keep in mind - setting the pointer to NULL after free'ing it is only half the job. Ideally, if you are using this idiom, you should also wrap pointer access in something like this:
if (ptr)
memcpy(ptr->stuff, foo, 3);
Just setting the pointer itself to NULL will only have the program crash in inopportune places, which is probably better than silently corrupting data but it's still not what you want.
There's no guarantee that the program crashes when accessing the NULL pointer.
Maybe not by the standard, but you'd be hard-pressed to find an implementation which does not define it as an illegal operation that causes a crash or exception (as appropriate to the runtime environment).
精彩评论