STL containers memory issue [duplicate]
I'm implementing my own graph library on Linux (Fedora 10 and CentOS 5) with gcc 4.3.2, using STL containers, and I've run into a problem with memory. When I build my graph I use a lot of memory, enough to be visible in top or any other memory usage tool. I'm sure I'm deallocating that memory (I reviewed the code again and again, and I used valgrind to check for memory leaks), but the memory remains in use (I can see this in top or in cat /proc/meminfo), and when I create the graph once again, memory usage does not increase; apparently the previously allocated memory is reused.
After several days of debugging, I created a very simple piece of code that has the same problem.
#include <iostream>
#include <list>
#include <unistd.h> // for sleep()

// Object that occupies 128KB.
// Data is not important.
class MyObject
{
public:
    int * a;
    int * b;
    int * c;
    int * d;

    MyObject( )
    {
        a = new int[ 8192 ];
        b = new int[ 8192 ];
        c = new int[ 8192 ];
        d = new int[ 8192 ];
    }

    MyObject( const MyObject & m )
    {
        a = new int[ 8192 ];
        b = new int[ 8192 ];
        c = new int[ 8192 ];
        d = new int[ 8192 ];
    }

    ~MyObject( )
    {
        delete [] a;
        delete [] b;
        delete [] c;
        delete [] d;
    }

    void operator=( const MyObject &m )
    {
        // Do nothing.
    }
};

typedef std::list< MyObject > list_t;

#define MB_TO_ALLOC 1000 // Size in MB that the program must alloc.
#define SLEEP_TIME  5    // Time in seconds that the program must wait before going on to the next step.
                         // It's used to give tools enough time to update the memory usage they report.

int main( )
{
    std::cout << "Alloc..." << std::endl;
    list_t * list = new list_t( );

    // Number of objects needed to alloc MB_TO_ALLOC megabytes of memory
    int nObjects = MB_TO_ALLOC * 1024 / 128;
    for( int i = 0; i < nObjects; ++i )
        list->push_back( MyObject( ) );

    std::cout << SLEEP_TIME << "s to Dealloc..." << std::endl;

    // Wait some time for a tool (like top) to update the memory usage
    sleep( SLEEP_TIME );

    std::cout << "Dealloc..." << std::endl;
    delete list;

    std::cout << SLEEP_TIME << "s to Alloc..." << std::endl;

    // Wait some time for a tool (like top) to update the memory usage
    sleep( SLEEP_TIME );

    // Repeat the procedure to evaluate the reuse of memory
    std::cout << "Alloc..." << std::endl;
    list = new list_t( );
    for( int i = 0; i < nObjects; ++i )
        list->push_back( MyObject( ) );

    std::cout << SLEEP_TIME << "s to Dealloc..." << std::endl;
    sleep( SLEEP_TIME );

    delete list;
}
I tried using a plain array or my own list class instead, and in those cases the memory is deallocated normally.
Does anyone know what's going on? How can I prevent this memory from staying "reserved"?
Thanks!
-- Bruno Caponi
In addition to the STL container, it might be the libc itself doing this (in the new/delete - malloc/free implementation). Userspace libraries are at liberty to retain memory for later reuse. Allocating and deallocating are expensive operations (in terms of clock cycles), so many implementations try to avoid doing them more often than necessary.
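As an illustration of that point: on glibc you can explicitly ask the allocator to hand unused heap memory back to the kernel with malloc_trim. This is only a minimal sketch (glibc-specific, not mentioned in the thread), and whether anything is actually released depends on the heap layout:

#include <malloc.h> // malloc_trim - glibc extension
#include <list>

int main( )
{
    std::list< int > * list = new std::list< int >( 1000000, 42 );
    delete list;      // memory goes back to the allocator, not necessarily to the OS
    malloc_trim( 0 ); // glibc-specific: release free memory at the top of the heap to the kernel
    return 0;
}

If the freed blocks sit at the top of the heap, the resident size reported by top typically drops after the malloc_trim call.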
gcc's STL has its own memory management layer that grabs big chunks of memory and doesn't give them back; there is an environment variable you can set to make it use raw new calls instead:
GLIBCPP_FORCE_NEW=1
(on gcc 3.4 and later the variable is spelled GLIBCXX_FORCE_NEW). I assume this makes it free the memory too. This environment variable is typically set when running under valgrind, so that valgrind doesn't think things have leaked.
Memory usage for a container class is handled by the container's allocator (which is passed as a template and constructor argument to std::list, and is std::allocator by default). The implementation of the default allocator may choose not to return memory to the system straight away, in order to prevent excess fragmentation of the heap.
If you want more direct control over this, you'll probably have to implement a custom allocator.
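For illustration, here is a minimal sketch of what such a custom allocator could look like. The name MmapAllocator and the decision to go through mmap/munmap are my own assumptions, not something prescribed by the answer: every allocation becomes an anonymous mapping and every deallocation an munmap, so freed memory goes straight back to the kernel and disappears from top immediately.

#include <list>
#include <new>        // placement new, std::bad_alloc
#include <cstddef>    // std::size_t, std::ptrdiff_t
#include <sys/mman.h> // mmap, munmap (Linux/POSIX)

// Hypothetical C++03-style allocator that bypasses the heap entirely.
template< typename T >
class MmapAllocator
{
public:
    typedef T              value_type;
    typedef T *            pointer;
    typedef const T *      const_pointer;
    typedef T &            reference;
    typedef const T &      const_reference;
    typedef std::size_t    size_type;
    typedef std::ptrdiff_t difference_type;

    template< typename U > struct rebind { typedef MmapAllocator< U > other; };

    MmapAllocator( ) throw( ) { }
    MmapAllocator( const MmapAllocator & ) throw( ) { }
    template< typename U > MmapAllocator( const MmapAllocator< U > & ) throw( ) { }

    pointer address( reference x ) const { return &x; }
    const_pointer address( const_reference x ) const { return &x; }

    pointer allocate( size_type n, const void * = 0 )
    {
        // Each allocation becomes its own anonymous mapping (rounded up to whole pages).
        void * p = mmap( 0, n * sizeof( T ), PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0 );
        if( p == MAP_FAILED )
            throw std::bad_alloc( );
        return static_cast< pointer >( p );
    }

    void deallocate( pointer p, size_type n )
    {
        // Unlike free(), munmap() always hands the pages back to the kernel.
        munmap( p, n * sizeof( T ) );
    }

    size_type max_size( ) const throw( ) { return size_type( -1 ) / sizeof( T ); }

    void construct( pointer p, const T & val ) { new( static_cast< void * >( p ) ) T( val ); }
    void destroy( pointer p ) { p->~T( ); }
};

template< typename T, typename U >
bool operator==( const MmapAllocator< T > &, const MmapAllocator< U > & ) { return true; }
template< typename T, typename U >
bool operator!=( const MmapAllocator< T > &, const MmapAllocator< U > & ) { return false; }

// Usage with the list from the question:
// typedef std::list< MyObject, MmapAllocator< MyObject > > list_t;

Keep in mind that std::list rebinds the allocator to its node type and allocates one node at a time, so with this sketch every node costs at least one page and two system calls; it only shows where the control point is, it is not an efficient allocator.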
On Linux, heap memory is traditionally handed to a process by the brk/sbrk system call, which moves the end of the data segment (the "program break") upward, making more RAM available to the process. The allocator can also get memory by using mmap, which maps pages at an address independent of the data segment, but allocators avoid it for small requests because every mapping costs a system call and at least a whole page; glibc's malloc only falls back to mmap for large requests (above a threshold of 128 KB by default). For this reason, memory obtained via brk cannot easily be given back to the kernel before the process terminates: the break can only shrink when the blocks at the very top of the heap happen to be free. If your specific application is making many large allocs/deallocs, then using mmap for those allocations can be a solution to keep the in-memory image size down a bit.
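With glibc you do not even have to call mmap yourself. Here is a hedged sketch (glibc-specific, and the 16 KB value is only an example) that lowers the threshold above which malloc - and therefore new int[8192] - is served by mmap, so the 32 KB arrays inside MyObject are unmapped and returned to the kernel as soon as they are deleted:

#include <malloc.h> // mallopt, M_MMAP_THRESHOLD - glibc extension

int main( )
{
    // Ask glibc's malloc to satisfy every request of 16 KB or more with mmap().
    // Such blocks are munmap()ed on free, so they leave the process image immediately.
    mallopt( M_MMAP_THRESHOLD, 16 * 1024 );

    // ... build and destroy the graph / list exactly as in the question ...
    return 0;
}

The trade-off is the same as above: each of those allocations now costs a system call, so this only pays off for large, long-lived blocks.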
I can't say for sure what is happening, but the problem (if it is a problem) is reproducible.
I thought it was a memory pool issue, wondering whether the STL was doing pooling in the specific allocator used. Drilling down into list<>, though, I found just a "new_allocator" that does nothing more than return the result of the global operator new:
pointer
allocate(size_type __n, const void* = 0)
{ return static_cast<_Tp*>(::operator new(__n * sizeof(_Tp))); }
The behaviour then comes down to glibc or libstdc++ memory handling, as I understand it. In my quick look I could not figure out how to circumvent this behaviour without implementing a custom allocator, nor whether a custom allocator would be guaranteed to behave differently.
I tried another test without the STL and there I could see resource usage going up and down. I would suggest creating a custom allocator that allocates an arbitrary number of elements in an array with placement new and takes care of allocating/deallocating those arrays. Logic tells me that resource usage should then behave just like in the "non-STL" test.
Please try it and tell us what happens. I won't do it myself because I have no time to spare now, in spite of my curiosity ;)
Note: the "Big Three" rule has no influence here. As I understand it, there is no memory leak and the contents of the object are not relevant. Bruno could have done a data copy, a self-assignment check, etc., but he just wrote an empty copy constructor to illustrate his point.
I ran into the same problem and, after a long debugging session, wrote a sample program which illustrates that it is some kernel (or maybe g++) issue.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <ctime>

static const size_t CHUNKS_COUNT = 1024 * 1024;

void* chunks[CHUNKS_COUNT];

int main(int argc, const char *argv[]) {
    bool additionalAlloc = false;
    if (argc > 1)
        additionalAlloc = true;

    fprintf(stdout, "%lu: starting allocating chunks, additionalAlloc=%d\n",
            time(NULL), additionalAlloc);

    for (size_t n = 0; n < 2; ++n) {
        void* additionalChunk;

        // 1GB
        for (size_t c = 0; c < CHUNKS_COUNT; ++c) {
            static const size_t BASE_CHUNK_SIZE = 1024;
            chunks[c] = malloc(BASE_CHUNK_SIZE);
        }
        if (additionalAlloc) {
            // 33 is arbitrary, but for instance for 123 given example
            // is not working - magic :-)
            additionalChunk = malloc(33);
        }

        fprintf(stdout, "%lu: finished allocating chunks, n=%lu\n",
                time(NULL), n);
        sleep(60);

        for (size_t c = 0; c < CHUNKS_COUNT; ++c) {
            free(chunks[c]);
        }
        if (additionalAlloc)
            free(additionalChunk);

        fprintf(stdout, "%lu: finished freeing chunks, n=%lu\n",
                time(NULL), n);
        sleep(60);
    }

    sleep(60);
    fprintf(stdout, "%lu: finishing program\n", time(NULL));
    return 0;
}
When it is run without a parameter (additionalAlloc is false), memory is freed back to the system after calling free. But when it is run with a parameter (additionalAlloc is true), memory is freed only after the program finishes. I ran it on Debian Squeeze with g++ 4.4.5-1 on a 2.6.18-6-xen-amd64 kernel. I don't know how it behaves on other systems, but seeing that 123 bytes for the additional chunk leads to different program behaviour, there is a big chance that it won't reproduce - but believe me, for these settings it worked :-)
PS. Can anybody explain why the values 33 and 123 for the additional chunk lead to different behaviour?