开发者

0xDEADBEEF vs. NULL

Throughout various code开发者_C百科, I have seen memory allocation in debug builds with NULL...

memset(ptr,NULL,size);

Or with 0xDEADBEEF...

memset(ptr,0xDEADBEEF,size);
  1. What are the advantages to using each one, and what is the generally preferred way to achieve this in C/C++?
  2. If a pointer was assigned a value of 0xDEADBEEF, couldn't it still deference to valid data?


  1. Using either memset(ptr, NULL, size) or memset(ptr, 0xDEADBEEF, size) is a clear indication of the fact that the author did not understand what they were doing.

    Firstly, memset(ptr, NULL, size) will indeed zero-out a memory block in C and C++ if NULL is defined as an integral zero.

    However, using NULL to represent the zero value in this context is not an acceptable practice. NULL is a macro introduced specifically for pointer contexts. The second parameter of memset is an integer, not a pointer. The proper way to zero-out a memory block would be memset(ptr, 0, size). Note: 0 not NULL. I'd say that even memset(ptr, '\0', size) looks better than memset(ptr, NULL, size).

    Moreover, the most recent (at the moment) C++ standard - C++11 - allows defining NULL as nullptr. nullptr value is not implicitly convertible to type int, which means that the above code is not guaranteed to compile in C++11 and later.

    In C language (and your question is tagged C as well) macro NULL can expand to (void *) 0. Even in C (void *) 0 is not implicitly convertible to type int, which means that in general case memset(ptr, NULL, size) is simply invalid code in C.

    Secondly, even though the second parameter of memset has type int, the function interprets it as an unsigned char value. It means that only one lower byte of the value is used to fill the destination memory block. For this reason memset(ptr, 0xDEADBEEF, size) will compile, but will not fill the target memory region with 0xDEADBEEF values, as the author of the code probably naively hoped. memset(ptr, 0xDEADBEEF, size) is eqivalent to memset(ptr, 0xEF, size) (assuming 8-bit chars). While this is probably good enough to fill some memory region with intentional "garbage", things like memset(ptr, NULL, size) or memset(ptr, 0xDEADBEEF, size) still betray the major lack of professionalism on the author's part.

    Again, as other answer have already noted, the idea here is to fill the unused memory with a "garbage" value. Zero is certainly not a good idea in this case, since it is not "garbagy" enough. When using memset you are limited to one-byte values, like 0xAB or 0xEF. If this is good enough for your purposes, use memset. If you want a more expressive and unique garbage value, like 0xDEDABEEF or 0xBAADFOOD, you won't be able to use memset with it. You'll have to write a dedicated function that can fill memory region with 4-byte pattern.

  2. A pointer in C and C++ cannot be assigned an arbitrary integer value (other than a Null Pointer Constant, i.e. zero). Such assignment can only be achieved by forcing the integral value into the pointer with an explicit cast. Formally speaking, the result of such a cast is implementation defined. The resultant value can certainly point to valid data.


Writing 0xDEADBEEF or another non-zero bit pattern is a good idea to be able to catch both write-after-delete and read-after-delete uses.

1) Write after delete

By writing a specific pattern you can check if a block that has already been deallocated was written over later by buggy code; in our debug memory manager we use a free list of blocks and before recycling a memory block we check that our custom pattern are still written all over the block. Of course it's sort of "late" when we discover the problem, but still much earlier than when it would be discovered not doing the check. Also we have a special function that is called periodically and that can also be called on demand that just goes through the list of all freed memory blocks and check their consistency and so we can call this function often when chasing a bug. Using 0x00000000 as value wouldn't be as effective because zero may possibly be exactly the value that buggy code wants to write in the already deallocated block e.g. zeroing a field or setting a pointer to NULL (it's instead more unlikely that the buggy code wants to write 0xDEADBEEF).

2) Read after delete

Leaving the content of a deallocated block untouched or even writing just zeros will increase the possibility that someone reading the content of a dead memory block will still find the values reasonable and compatible with invariants (e.g. a NULL pointer as on many architectures NULL is just binary zeroes, or the integer 0, the ASCII NUL char or a double value 0.0). By writing instead "strange" patterns like 0xDEADBEEF most of code that will access in read mode those bytes will probably find strange unreasonable values (e.g. the integer -559038737 or a double with value -1.1885959257070704e+148), hopefully triggering some other self consistency check assertion.

Of course nothing is really specific to the bit pattern 0xDEADBEEF, actually we use different patterns for freed blocks, before-block area, after-block area and and also our memory manager writes another (address-dependent) specific bit pattern to the content part of any memory block before giving it to the application (this is to help finding uses of uninitialized memory).


I would definitely recommend 0xDEADBEEF. It clearly identifies uninitialized variables, and accesses to uninitialized pointers.

Being odd, dereferencing a 0xdeadbeef pointer will definitely crash on the PowerPC architecture when loading a word, and very likely crash on other architectures since the memory is likely to be outside the process' address space.

Zeroing out memory is a convenience since many structures/classes have member variables that use 0 as their initial value, but I would very much recommend initializing each member in the constructor rather than using the default memory fill. You will really want to be on top of whether or not you properly initialized your variables.


http://en.wikipedia.org/wiki/Hexspeak

These "magic" numbers are are a debugging aid to identify bad pointers, uninitialized memory etc. You want a value that is unlikely to occur during normal execution and something that is visible when doing memory dumps or inspecting variables. Initializing to zero is less useful in this regard. I would guess that when you see people initialize to zero it is because they need to have that value at zero. A pointer with a value of 0xDEADBEEF could point to a valid memory location so it's a bad idea to use that as an alternative to NULL.


One reason that you null the buffer or set it to a special value is that you can easily tell whether the buffer contents is valid or not in the debugger.

Dereferencing a pointer of value "0xDEADBEEF" is almost always dangerous(probably crashes your program/system) because in most cases you have no idea what is stored there.


DEADBEEF is an example of HexSpeek. With it, as a programmer you convey intentionally an error condition.


I would personally recommend using NULL (or 0x0) as it represents the NULL as expected and comes in handy while comparison. Imagine you are using char * and in between on DEADBEEF for some reason (don't know why), then at least your debugger will come very handy to tell you that its 0x0.


I would go for NULL because it's much easier to mass zero out memory than to go through later and set all the pointers to 0xDEADBEEF. In addition, there's nothing at all stopping 0xDEADBEEF from being a valid memory address on x86- admittedly, it would be unusual, but far from impossible. NULL is more reliable.

Ultimately, look- NULL is the language convention. 0xDEADBEEF just looks pretty and that's it. You gain nothing for it. Libraries will check for NULL pointers, they don't check for 0xDEADBEEF pointers. In C++ then the idea of the zero pointer isn't even tied to a zero value, just indicated with the literal zero, and in C++0x there is a nullptr and a nullptr_t.


Vote me down if this is too opinion-y for StackOverflow but I think this whole discussion is a symptom of a glaring hole in the toolchain we use to make software.

Detecting uninititialized variables by initializing memory with "garabage-y" values detects only some kinds of errors in some kinds of data.

And detecting uninititialized variables in debug builds but not for release builds is like following safety procedures only when testing an aircraft and telling the flying public to be satisfied with "well, it tested OK".

WE NEED HARDWARE SUPPORT for detecting uninitialized variables. As in something like an "invalid" bit that accompanies every addressability entity of memory (=byte on most of our machines) and which is set by the OS in every byte VirtualAlloc() (et. al, or equivalents on other OS's) hands over to applications and which is automatically cleared when the byte is written to but which causes an exception if read first.

Memory is cheap enough for this and processors are fast enough for this. This end of reliance on "funny" patterns and keeps us all honest to boot.


Note that the second argument in memset is supposed to be a byte, that is it is implicitely cast to a char or similar. 0xDEADBEEF would for most platforms convert to 0xEF (and something else for some odd platform).

Also note that the second argument is supposed to formally be an int which NULL isn't.

Now for the advantage of doing these kind of initialization. First of course the behavior would more likely be deterministic (even if we by this ends up in undefined behavior the behavior would in practice be consistent).

Having deterministic behavior will mean that debugging becomes easier, when you found a bug you would "only" have to provide the same input and the fault will manifest itself.

Now when you select which value you would use you should select a value that most likely will result in bad behavior - which means the use of uninitialized data would more likely result in a fault being observed. This means that you would have to use some knowledge of the platform in question (however many of them behave quite similar).

If the memory is used to hold pointers then indeed having cleared the memory will mean that you get a NULL pointer and normally dereferencing that will result in segmentation fault (which will be observed as a fault). However if you use it in another way, for example as an arithmetic type then you will get 0 and for many application that is not that odd number.

If you instead use 0xDEADBEEF you will get a quite large integer, also when interpreting the data as floating point it will also be quite large number (IIRC). If interpreting it as text it will be very long and contain non-ascii characters and if you use UTF-8 encoding it will likely be invalid. Now if used as a pointer on some platform it would fail alignment requirements for some types - also on some platforms that region of memory might be mapped out anyway (note that on x86_64 the value of the pointer would be 0xDEADBEEFDEADBEEF which is out of range for an address).

Note that while filling with 0xEF will have pretty much similar properties, if you want to fill the memory with 0xDEADBEEF you would need to use a custom function since memset doesn't do the trick.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜