Using NULL as a terminator in arrays?
I like "reinventing the wheel" for learning purposes, so I'm working on a container class for strings. Will using the NULL
character as an array terminator (i.e., the last value in the array will be NULL
) cause interference with the null-terminated strings?
I think it would only be an issue开发者_运维问答 if an empty string is added, but I might be missing something.
EDIT: This is in C++.
""
is the empty string in C and C++, not NULL
. Note that ""
has exactly one element (instead of zero), meaning it is equivalent to {'\0'}
as an array of char
.
char const *notastring = NULL;
char const *emptystring = "";
emptystring[0] == '\0'; // true
notastring[0] == '\0'; // crashes
No, it won't, because you won't be storing in an array of char, you'll be storing in an array of char*.
char const* strings[] = {
"WTF"
, "Am"
, "I"
, "Using"
, "Char"
, "Arrays?!"
, 0
};
It depends on what kind of string you're storing.
If you're storing C-style strings, which are basically just pointers to character arrays (char*
), there's a difference between a NULL
pointer value, and an empty string. The former means the pointer is ‘empty’, the latter means the pointer points to an array that contains a single item with character value 0 ('\0'
). So the pointer still has a value, and testing it (if (foo[3])
) will work as expected.
If what you're storing are C++ standard library strings of type string
, then there is no NULL
value. That's because there is no pointer, and the string
type is treated as a single value. (Whereas a pointer is technically not, but can be seen as a reference.)
I think you are confused. While C-strings are "null terminated", there is no "NULL" character. NULL
is a name for a null pointer. The terminator for a C-string is a null character, i.e. a byte with a value of zero. In ASCII, this byte is (somewhat confusingly) named NUL
.
Suppose your class contains an array of char
that is used to store the string data. You do not need to "mark the end of the array"; the array has a specific size that is set at compile-time. You do need to know how much of that space is actually being used; the null-terminator on the string data accomplishes that for you - but you can get better performance by actually remembering the length. Also, a "string" class with a statically-sized char buffer is not very useful at all, because that buffer size is an upper limit on the length of strings you can have.
So a better string class would contain a pointer of type char*
, which points to a dynamically allocated (via new[]
) array of char
s. Again, it makes no sense to "mark the end of the array", but you will want to remember both the length of the string (i.e. the amount of space being used) and the size of the allocation (i.e. the amount of space that may be used before you have to re-allocate).
When you are copying from std::string
, use the iterators begin()
, end()
and you don't have to worry about the NULL - in reality, the NULL is only present if you call c_str()
(in which case the block of memory this points to will have a NULL to terminate the string.) If you want to memcpy
use the data()
method.
Why don't you follow the pattern used by vector
- store the number of elements within your container class, then you know always how many values there are in it:
vector<string> myVector;
size_t elements(myVector.size());
Instantiating a string with x
where const char* x = 0;
can be problematic. See this code in Visual C++ STL that gets called when you do this:
_Myt& assign(const _Elem *_Ptr)
{ // assign [_Ptr, <null>)
_DEBUG_POINTER(_Ptr);
return (assign(_Ptr, _Traits::length(_Ptr)));
}
static size_t __CLRCALL_OR_CDECL length(const _Elem *_First)
{ // find length of null-terminated string
return (_CSTD strlen(_First));
}
#include "Maxmp_crafts_fine_wheels.h"
MaxpmContaner maxpm;
maxpm.add("Hello");
maxpm.add(""); // uh oh, adding an empty string; should I worry?
maxpm.add(0);
At this point, as a user of MaxpmContainer who had not read your documentation, I would expect the following:
strcmp(maxpm[0],"Hello") == 0;
*maxpm[1] == 0;
maxpm[2] == 0;
Interference between the zero terminator at position two and the empty string at position one is avoided by means of the "interpret this as a memory address" operator *. Position one will not be zero; it will be an integer, which if you interpret it as a memory address, will turn out to be zero. Position two will be zero, which, if you interpret it as a memory address, will turn out to be an abrupt disorderly exit from your program.
精彩评论