What do I need to know about memory in C++?
I've been doing my best to learn C++, but my previous training falls short in one major area: memory management. My primary languages all have automatic garbage collection, so keeping track of everything has never really been necessary. I've tried reading up on memory management in C++ online, but I have this sneaking suspicion that I am still missing something.
So, here's a multi-part question:
- What is the bare minimum I need to know about memory management? (or, where do I go to find that out)?
- Where do I go for intermediate and advanced knowledge/tutorials/etc (once I am done with the basics)? More specifically:
- What is the performance difference between pointers and references?
- I've heard that in loops, you need to make sure that you call delete on any new pointers before the loop re-iterates. Is this correct? Do you need to do something with references?
- What are some classic examples of memory leaks?
- What do I need to know about the following (and will I ever realistically need to use them -- if so, where?):
malloc
free
calloc
realloc
*********************** UPDATE *******************
This is to address a reference to lmgtfy in comment one (by Ewan). If you start reading the information which is available there, it is not useful to the beginner. It is great theory, I think, but it is neither pertinent nor useful to this question.
You really, really need to read a good book - learning C++ frankly is not possible without one. I recommend Accelerated C++, by Koenig & Moo, both of whom were closely involved in the early development and standardization of C++.
Memory Management
Basics
- Every 'use of' new must be matched by 'use of' delete
- Array new is different from normal new and has its own delete
int* data1 = new int(5);
delete data1;
int* data2 = new int[5];
delete [] data2;
Must Know
- Exceptions
- RAII
- Rule of 4.
throwing exceptions out of a destructor
Dynamically allocating an array of objects
Pattern name for create in constructor, delete in destructor (C++)
Best Practice
- Never actually use RAW pointers.
- Always wrap pointers in Smart Pointers.
- Learn the different types of smart pointers and when to use each
Smart Pointers: Or who owns you baby?
Advanced:
- Understanding Exception Guarantees
- Understanding the use of throw clause
What are the principles guiding your exception handling policy?
Common ways to leak
Basics
// Rule: every new must be matched by a delete.
int* data = 0;
for(int loop = 0;loop < 10;++loop)
{
    data = new int(5);
}
delete data;
// The problem is that every 'use of' new is not matched by a delete.
// Here we allocate 10 integers but only release the last one.
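A minimal sketch of two ways to fix the loop above, assuming C++11 or later. The function names `sumWithManualDelete` and `sumWithValues` are made up for illustration. The first releases every allocation inside the loop; the second avoids `new` altogether by keeping the values in a `std::vector`.

```cpp
#include <vector>

// Option 1: pair every new with a delete inside the same iteration.
int sumWithManualDelete()
{
    int total = 0;
    for (int loop = 0; loop < 10; ++loop)
    {
        int* data = new int(5);
        total += *data;
        delete data;            // released before the next iteration
    }
    return total;
}

// Option 2: no new at all; the vector owns its elements and frees them.
int sumWithValues()
{
    std::vector<int> data(10, 5);   // ten ints, each initialised to 5
    int total = 0;
    for (int value : data)
    {
        total += value;
    }
    return total;
}
```

Both functions return 50; the second is usually the better default because there is nothing to forget.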
Must Know
class MyArray
{
// Use RAII to manage the dynamic array in an exception safe manner.
public:
MyArray(int size)
:data( new int[size])
{}
~MyArray()
{
delete [] data;
}
// PROBLEM:
// Ignored the rule of 4.
// The compiler will generate a copy constructor and assignment operator.
// These default compiler generated methods just copy the pointer. This will
// lead to double deletes on the memory.
private:
int* data;
};
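One hedged way to close the hole described in the comments above, sketched here rather than taken from the original answer: give the class its own copy constructor and assignment operator so that each object owns a separate buffer. The class name `SafeArray` and its `at` and `size` members are illustrative additions.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

class SafeArray
{
public:
    explicit SafeArray(std::size_t size)
        : size_(size), data_(new int[size]())   // () zero-initialises
    {}

    // Copy constructor: allocate a separate buffer and copy the elements.
    SafeArray(const SafeArray& other)
        : size_(other.size_), data_(new int[other.size_])
    {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment via copy-and-swap: the parameter is already a copy,
    // so swapping with it hands our old buffer to its destructor.
    SafeArray& operator=(SafeArray other)
    {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }

    ~SafeArray()
    {
        delete [] data_;
    }

    int& at(std::size_t index) { return data_[index]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

With these members in place, copying a `SafeArray` produces an independent buffer, so each destructor deletes memory that only it owns.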
Best Practice
// Understand what the properties of the smart pointers are:
//
std::vector<std::auto_ptr<int> > data;
// Will not work. You can't put auto_ptr into a standard container.
// This is because it uses move semantics not copy semantics.
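For comparison, a sketch assuming a C++11 compiler: `std::unique_ptr` was designed with real move semantics, so it can be stored in a standard container. The helper name `makeOwnedInts` is hypothetical.

```cpp
#include <memory>
#include <vector>

// Builds a vector that owns heap-allocated ints.
std::vector<std::unique_ptr<int>> makeOwnedInts(int count)
{
    std::vector<std::unique_ptr<int>> data;
    for (int i = 0; i < count; ++i)
    {
        // The temporary unique_ptr is moved into the vector.
        data.push_back(std::unique_ptr<int>(new int(i)));
    }
    return data;   // every int is deleted when the owning vector is destroyed
}
```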
Advanced:
// Guarantee that exceptions don't screw up your object:
//
class MyArray
{
// ... As Above: Plus
void resize(int newSize)
{
delete [] data;
data = new int[newSize];
// What happens if this new throws (because there is not enough memory)?
// You have already deleted the old data, so data now points at invalid memory.
// The exception will leave the object in a completely invalid state.
}
};
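A sketch of one common fix, not necessarily what the original author had in mind: allocate the new block first and only release the old one once the allocation has succeeded. The class name `ResizableArray` and the `size()` accessor are illustrative, and copying is disabled (C++11 syntax) purely to keep the sketch short. Like the original `resize`, it does not preserve the old contents.

```cpp
#include <cstddef>

class ResizableArray
{
public:
    explicit ResizableArray(std::size_t size)
        : size_(size), data_(new int[size]())
    {}

    ~ResizableArray()
    {
        delete [] data_;
    }

    // Copying is disabled here only to keep the example small.
    ResizableArray(const ResizableArray&) = delete;
    ResizableArray& operator=(const ResizableArray&) = delete;

    void resize(std::size_t newSize)
    {
        // If this new throws, data_ and size_ are untouched,
        // so the object is still in its old, valid state.
        int* fresh = new int[newSize]();

        delete [] data_;      // deleting ints cannot throw
        data_ = fresh;
        size_ = newSize;
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```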
What you need to know about memory management, in the simplest sense, is that you need to delete the memory that you allocate on the heap. So when creating an object like MyClass *myClass = new MyClass(x); you need to have some place in your code that frees it with a corresponding delete. This appears easy in principle, but without a proper design and the use of helper objects such as shared pointers, it can quickly get messy, especially as code is maintained and features are added. For example, here is a classic memory leak:
try
{
MyClass *myClass = new MyClass(x);
// Do some stuff that can throw an exception
delete myClass;
}
catch(...)
{
// Memory leak on exceptions. Delete is never called
}
OR another big memory management gotcha is calling the wrong type of delete:
int* set = new int[100];
delete set; // Incorrect - Undefined behavior
// delete [] set; is the proper way to delete an array
A common way to help yourself is to use the RAII idiom (Resource Acquisition Is Initialization).
Here is an example of using the std library in order to prevent a memory leak:
try
{
auto_ptr<MyClass> myClass(new MyClass(x));
// Now the heap allocated memory associated with myClass
// will automatically be destroyed when it goes out of scope,
// but you can use it much like a regular pointer
myClass->memberFunction();
}
catch (...)
{
}
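More info on auto_ptr can be found here. If you can use C++11, shared_ptr is a highly recommended choice and is often preferred over auto_ptr. (auto_ptr itself was deprecated in C++11 and removed in C++17; std::unique_ptr is its direct replacement for single ownership.)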
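A sketch of the same RAII idea with `std::unique_ptr`, assuming C++11. `Tracked` and `liveObjects` are made-up names used only to make the cleanup observable.

```cpp
#include <memory>
#include <stdexcept>

int liveObjects = 0;   // counts Tracked objects currently alive

struct Tracked
{
    Tracked()  { ++liveObjects; }
    ~Tracked() { --liveObjects; }
    void memberFunction() { throw std::runtime_error("something failed"); }
};

// Returns true if the exception was caught.
bool useAndRecover()
{
    try
    {
        std::unique_ptr<Tracked> object(new Tracked());
        object->memberFunction();   // throws
    }
    catch (const std::runtime_error&)
    {
        // Stack unwinding already ran object's destructor,
        // which deleted the Tracked instance.
        return true;
    }
    return false;
}
```

After `useAndRecover()` returns, `liveObjects` is back to 0 even though the code never calls `delete` itself.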
First, you should understand the concepts of the stack and the heap.
After you understand these concepts, proceed to learning the language constructs.
What is the bare minimum I need to know about memory management? (or, where do I go to find that out)?
For every new, there must be a delete
Where do I go for intermediate and advanced knowledge/tutorials/etc (once I am done with the basics)?
Read Effective C++, More Effective C++, and Effective STL. Then google (std::)auto_ptr, (boost::)scoped_ptr, and (boost::)shared_ptr
More specifically: What is the performance difference between pointers and references?
I do not know off the top of my head, but since a reference is usually implemented as a pointer under the hood, I don't foresee any large performance difference.
I've heard that in loops, you need to make sure that you call delete on any new pointers before the loop re-iterates. Is this correct?
Yes.
Do you need to do something with references?
No.
What are some classic examples of memory leaks?
int * foo() {
...
return new int(...);
}
int main() {
int i = *foo();
...
//the new int() from foo leaks
}
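Two hedged ways to avoid that leak, assuming C++11. `fooByValue`, `fooOwned`, and `useBoth` are illustrative names, and `42` is only a placeholder for the value elided in the original.

```cpp
#include <memory>

// Simplest fix: return the int by value, no heap allocation at all.
int fooByValue()
{
    return 42;
}

// If the object really must live on the heap, return an owning smart pointer.
std::unique_ptr<int> fooOwned()
{
    return std::unique_ptr<int>(new int(42));
}

int useBoth()
{
    int a = fooByValue();
    int b = *fooOwned();   // the temporary unique_ptr deletes the int
                           // at the end of this statement
    return a + b;
}
```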
What do I need to know about the following (and will I ever realistically need to use them -- if so, where?):
First of all, you should never delete a malloc'ed pointer and never free a pointer created with new. In general, these functions should not appear in C++ code. However, if you find yourself in C-land...
malloc : Similar to new (allocates memory on the heap)
free : Similar to delete (free memory on the heap)
calloc : Similar to new + memset (allocates memory on the heap, sets it to zero)
realloc: Attempts to resize a block of memory, or creates a new block of memory and copies the old data, freeing the old pointer. No real C++ equivalent.
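A small sketch of how calloc, realloc, and free fit together, using only `<cstdlib>` (malloc is used the same way as calloc, minus the zeroing). The function name `growBuffer` is hypothetical.

```cpp
#include <cstdlib>

// Allocates a zeroed buffer, grows it, and returns the first byte of the
// grown buffer. Returns -1 if an allocation fails.
int growBuffer()
{
    // calloc: 16 elements of 1 byte each, all set to zero.
    char* buffer = static_cast<char*>(std::calloc(16, 1));
    if (buffer == NULL)
    {
        return -1;
    }
    buffer[0] = 'A';

    // realloc may move the block. On failure it returns NULL and the
    // original block is still valid, so keep the old pointer until we know.
    char* bigger = static_cast<char*>(std::realloc(buffer, 64));
    if (bigger == NULL)
    {
        std::free(buffer);
        return -1;
    }
    buffer = bigger;

    int first = buffer[0];   // the old contents were carried over
    std::free(buffer);       // memory from malloc/calloc/realloc goes to free
    return first;
}
```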
Some neat memory stuff can be found by googleing (is that how it's spelled?) placement new
You should look into smart pointers, they make your life a lot easier when it comes to memory management.
The book "Memory as a Programming Concept in C and C++" is a very good read for someone who is new to C/C++.
From your list, you've missed new and delete - some say never to use malloc and free.
Also the oft-forgotten delete[].
Wow, this is a lot to tackle.
The most important thing is to be consistently diligent and disciplined. This is true with any resource in any language, even safer managed languages. People feel that when a language manages their memory for them, they don't have to think about it. But it's always best to release any resources as quickly as possible after you're finished with them. I've always felt "Garbage Collection" has made programmers lazy in recent years.
Whenever I allocate memory in a language like C++, I make sure that I write the deallocation first, before using the memory. In other words, I write the allocate and the deallocate, and afterwards fill in the middle. It is important to get into a consistent habit. I think that's the bare minimum to learn... proper and disciplined management of resources. That's not just about memory; it should be applied to all resources, including database references, file references, context handles, and other such animals.
The entire subject of memory management in C++ is fairly vast. I would say read, study, and code as much as possible.
example:
char* myusedmemory;
myusedmemory = (char *)malloc(1000); // allocate memory
free(myusedmemory); // immediately deallocate memory
/* go back and fill in the code between */
There are plenty of good references to go for additional knowledge on the subject. I found going through the tutorials on relisoft.com was helpful for me, though the main tutorial on there is Windows specific. Another good reference can be found here.
As far as differences between pointers and references, one of the main differences is flexibility. You have to define the reference immediately ( int iExample; int& refExample = iExample; ). I wouldn't think there would be much of a performance difference. However, pointers, being more powerful and more flexible, will be more dangerous, and will require the aforementioned discipline to manage.
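To make the flexibility difference concrete, a minimal sketch assuming C++11 (for `nullptr`); the function name `pointerVsReference` is made up.

```cpp
// Returns a number that encodes the final values of both variables.
int pointerVsReference()
{
    int first = 1;
    int second = 2;

    int& ref = first;     // must be bound when declared
    ref = second;         // does NOT rebind; assigns 2 to first

    int* ptr = nullptr;   // a pointer may start out pointing at nothing
    ptr = &second;        // and can be pointed at a different object later
    *ptr = 10;            // second is now 10, first is unchanged

    return first * 100 + second;   // 2 * 100 + 10
}
```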
Examples of memory leaks are here, but you can find more by googling "memory leaks in C++".
As for malloc, free, calloc, and realloc, these are just functions like any others; in this case they are declared in stdlib.h (cstdlib in C++). You should just have an understanding of what they do and how to use them, just like you would with any other function, such as the common printf().
As a note: smart pointers are a very good way to go and are generally safer.
As another note, I wanted to mention Code Complete, the best book I've read on the subject of resource management. I've read it cover to cover many, many times.
In other languages you already have to keep track of database connections, window handles, sockets, etc. with mechanisms such as "finally" (in Java) or "using" (in C#). In C++, just add memory to that list. It's not really conceptually any different.
Here's something that often catches students: Large, really large objects such as arrays, should be allocated in dynamic memory (i.e. using new).
Also, don't pass around large objects. Pass pointers, preferably smart pointers, to the objects. Copying large objects consumes a lot of processor time.
Set up and document rules about object allocation and ownership. Does the callee or the caller own the object?
Don't return references to local objects, nor pointers to local objects.
Learn about RAII. Some people here pointed it out, but at the same time explained the new/delete stuff without stressing the importance of RAII. With RAII you don't have to worry about manual memory management.
People new to C++ tend to code as they would in Java, putting "new" everywhere. In many cases this is a bad habit; in some cases you can't avoid it (though in my experience on my own projects, those cases almost never come up).
Just adding this comment to stress it ;) However all the comments are perfectly right.
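As a rough illustration of RAII applied to something other than memory, here is a sketch of a wrapper around a C `FILE*`. `FileHandle` is a hypothetical class written for this example; real code would more likely reach for `std::fstream`.

```cpp
#include <cstdio>
#include <stdexcept>

class FileHandle
{
public:
    // Acquire the resource in the constructor.
    FileHandle(const char* path, const char* mode)
        : file_(std::fopen(path, mode))
    {
        if (file_ == NULL)
        {
            throw std::runtime_error("could not open file");
        }
    }

    // Release it in the destructor, which runs on every exit path,
    // including early returns and exceptions.
    ~FileHandle()
    {
        std::fclose(file_);
    }

    std::FILE* get() const { return file_; }

private:
    // Non-copyable (C++03 style): two owners would close the file twice.
    FileHandle(const FileHandle&);
    FileHandle& operator=(const FileHandle&);

    std::FILE* file_;
};
```

A caller simply declares `FileHandle file(path, "r");` inside a scope and uses `file.get()`; there is no matching close call to remember.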
Everyone is mentioning new and delete, but most of the time, you don't need and shouldn't use them explicitly:
- The best thing is to use standard containers and let them do the memory management.
- When that is not possible, use smart pointers with either reference counting or, as a last resort, smarter garbage collection.
Of course, for performance reasons you might want an occasional new and delete pair in performance-critical code, but that should be the exception rather than the rule.
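A sketch of what letting the containers do the work looks like in practice; `joinWords` is an illustrative function, not something from the answer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a sentence without a single new or delete: std::vector and
// std::string allocate and free their own storage.
std::string joinWords(const std::vector<std::string>& words)
{
    std::string result;
    for (std::size_t i = 0; i < words.size(); ++i)
    {
        if (i != 0)
        {
            result += ' ';
        }
        result += words[i];
    }
    return result;
}
```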
new and delete are the two most important keywords for memory management. At its simplest, you just need to remember to call delete for every object that you create with new. Therefore, if you call new in a loop, you'll need to make sure that you call delete on each of those new'ed objects. You don't need to do it from within the loop, so long as you save a copy of each pointer somewhere so that it can be deleted later.
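A minimal sketch of the "save each pointer and delete it later" approach; `allocateThenRelease` is a made-up name. Note the assumption that nothing throws between the two loops: if `push_back` threw, the objects already allocated would leak, which is one reason smart pointers or plain values are usually preferred.

```cpp
#include <cstddef>
#include <vector>

int allocateThenRelease()
{
    std::vector<int*> pointers;
    for (int i = 0; i < 10; ++i)
    {
        pointers.push_back(new int(i));   // keep every pointer
    }

    int total = 0;
    for (std::size_t i = 0; i < pointers.size(); ++i)
    {
        total += *pointers[i];
        delete pointers[i];               // one delete per new
    }
    return total;   // 0 + 1 + ... + 9
}
```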
malloc, free, calloc, and realloc are all probably more advanced than what you need to worry about. I guess just remember they are there if the standard new/delete ever feels limiting.
That all said, smart pointers can be a big help, but sometimes it's helpful to know how to do stuff the hard way before tackling smart pointers.
As I've been learning C++, I've found that using a memory analysis tool like Valgrind is indispensable in helping to find memory leaks. When you run your program (compiled with debug symbols) from Valgrind, it will identify the lines where memory gets allocated but is never deallocated later.
I use these command line arguments:
valgrind --leak-check=yes --num-callers=8 ./myExecutable
Note that your program will run much slower than when run on its own, but it's often worth the effort.