How does C/C++ know how long a dynamically allocated array is?
This question has been bothering me for a while.
If I do int* a = new int[n], for example, I only have a pointer that points to the beginning of array a, but how does C/C++ know about n? I know that if I want to pass this array to another function, then I have to pass the length of the array with it, so I guess C/C++ does not really know how long this array is.
I know we can infer the end of a character array (char*) by looking for the NUL terminator. But is there a similar mechanism for other arrays, like int? Besides, char can hold more than a character -- you can also treat it as an integer type. How does C++ know where such an array ends?
This question started to bother me even more while I was working with embedded Python (if you are not familiar with embedded Python, you may ignore this paragraph and just answer the above questions; I will still appreciate it). In Python there is a "ByteArray", and the only way to convert this "ByteArray" to C/C++ is to use PyString_AsString() to convert it to char*. But if this ByteArray has a 0 byte in it, then C/C++ would think that the char* array stops early. This is not the worst part. The worst part is, say I do:
char* arr = PyString_AsString(something);
void* pt = calloc(1, 1000);
If arr happens to start with 0, then C/C++ is almost guaranteed to wipe out everything in arr, since it thinks arr ends right after the first NUL appears. It might then overwrite everything in arr when it allocates a chunk of memory for pt.
Thank you very much for your time! I really appreciate it.
C/C++ doesn't; it's the allocator (the little piece of code that implements malloc(), free(), etc.) that knows how long it is. The language itself is free of the constraint of having to worry about the length, which is also why it is welcome to wee all over memory past the end of the array.
Also, see PyString_AsStringAndSize(), which gives you the length along with the char*.
Let's hit the disassembler! This is going to be different for C and C++. How free works in C is covered in another question, and here's how it works in C++:
struct T {
~T();
int data;
};
void test(T* p)
{
delete[] p;
}
And let's run the compiler to produce assembly. Here are the relevant bits, compiled for i386:
movl -4(%edi), %eax
leal (%edi,%eax,4), %esi
cmpl %esi, %edi
je L4
.align 4,0x90
L8:
subl $4, %esi
movl %esi, (%esp)
call L__ZN1TD1Ev$stub
cmpl %esi, %edi
jne L8
You can see the important part: there is an integer stored just before the start of p containing the number of elements in p (the code loads it from -4(%edi) and scales it by 4, the size of T). The code then loops over the p array from the last element down to the first, calling the destructor for each item. It then calls delete, which is usually fairly boring because it just calls free (the C function). So you can see how C++ delete is expressed in terms of free.
Destructors and Exceptions: Based on the above assembly, you can see that if the destructor for T threw an exception, then part of the p array would have its destructors called and the rest of the array would not. Destructors should never throw exceptions.
Caveat: This is only one possible way that your compiler and runtime can solve this problem. (Here, the destructor is called by compiler-generated code and delete is part of the runtime.) There is quite a bit of leeway in how these are implemented, and yours could be different. This also shows why you should always call the correct operator, delete[] or delete -- calling the wrong one will cause all sorts of trouble, such as stomping on memory and freeing invalid pointers.
About NUL terminators: The only reason NUL terminators are a problem is that code which receives a bare char* (strlen, strdup, PyString_FromString, and so on) scans for the terminator to figure out how long the string is. PyString_AsString itself only returns a pointer to the string's internal buffer; the truncation happens in whatever consumes that pointer afterwards. free, however, doesn't care about NUL terminators; it keeps track of the length from the original malloc call separately. For functions that take a char* this is not an option, because there is no portable way to get the size of a region of memory -- malloc and free do not expose this functionality. Besides, you can pass them a pointer into the middle of a malloc block, or to memory that did not come from malloc at all.
- See also: How does free know how much to free?
C/C++ doesn't track the length of any array, so you can easily read or write past the end of an array. C/C++ doesn't know the length of a char array either.
A char* can point to a string, but it is not the same thing as a string. Terminating a string with NUL is a C/C++ convention.