memory use of STL data structures, windows vs. linux
I have a program that heavily uses std::map
. Under Windows, much more memory is used as under Linux. Has anyone an idea why this happens?
Linux:
Last process took 42.31 s and used not more than 909 MB (RSS 900 MB) of memory
Windows:
Last process took 75.373 s and used not more than 1394 MB (RSS 1395 MB) of memory
I use gcc 4.4.3 and the VS 2010 C++ compiler on the command line, with release settings.
EDIT: Sorry for answering the questions that late...
The code looks like this:
enum Symbol {
...
}
class GraphEntry {
public:
...
virtual void setAttribute (Symbol name, Value * value) = 0;
const Value * attribute (Symbol name) const;
private:
std::map<Symbol, Attribute> m_attributes;
};
class Attribute {
public:
Attribute (Symbol name, Value * val);
...
Symbol name () const;
Value * valuePointer () const;
void setValuePointer (Value * p);
private:
Symbol m_name;
Value * m_value;
};
class Graph : public GraphEntry {
...
public:
Node * newNode (...);
Graph * newSubGraph (...);
Edge * newEdge (...);
...
setSomeAttribute (int x);
setSomeOtherAttribute (float f);
...
private:
std::vector<GraphEntry *> m_entries;
};
The whole thing describes a graph structure, which can hold some attributes on its nodes and edges. Value
is just a base class, and the derived cla开发者_如何学Gosses can hold values with arbitrary types, like int
or std::string
.
EDIT 2:
Under Windows, I use the following flags: -DRELEASE -DNDEBUG -DQT_NO_DEBUG -DQT_NO_DEBUG_OUTPUT -D_CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -DNOMINMAX /O2 /MD /Gy /EHsc
EDIT 3:
The memory usage is read from a /proc file under linux (like memuse
).
Under Windows, some WinAPI functions are called, but I am not the expert for this, so that's all what I can say about it.
EDIT 4:
Using /GS-
and -D_SECURE_SCL
results in Last process took 170.281 s and used not more than 1391 MB (RSS 1393 MB) of memory
You'll observe that memory usage on Windows is somewhere between 1 and 2 times greater. Heap algorithms aside, Windows malloc()
, and subsequently any data structures allocated on the heap via new
(such as std::map
's nodes with the default allocator type), are aligned to 16 bytes. On Linux, glibc defaults to 8 byte alignment. Assuming some smoothing in differences due to fragmentation, optimization reaping of unused pages etc. you can expect the differences to become less apparent
A quick check of your code indicates map key and value types should be 4 and 8 bytes respectively (Symbol
and Attribute
). These will round up to 8 bytes on Linux, and 16 bytes on Windows. You should have an equal number of map nodes, at least in the MSVC implementation, these look to consume a minimum of 22 bytes, which MSVC will expand to 32 due to its member alignment rules, which is also its malloc granularity. GCC will expand its to 24, meaning an approximate total of 48 bytes in MSVC to GCC/Linux' 32 per node. Roughly 50% more memory usage on Windows.
Here's the node structure used in MSVC, I can look up the GCC equivalent if you are interested:
struct _Node
{ // tree node
_Nodeptr _Left; // left subtree, or smallest element if head
_Nodeptr _Parent; // parent, or root of tree if head
_Nodeptr _Right; // right subtree, or largest element if head
value_type _Myval; // the stored value, unused if head
char _Color; // _Red or _Black, _Black if head
char _Isnil; // true only if head (also nil) node
I'll add for those who are unfamiliar with how memory usage works, there are several factors at play:
- Memory is allocated in chunks rounding up to the next multiple of the alignment for the allocation mechanism used. For the heap, the
malloc()
alignment rules are in use (unless you subvert the usual heap or use some other allocator than the default). - Virtual memory is "provided" by the system in chunks known as pages, integer multiples of the frame size, which is beyond the scope of this question. This has a minor effect on the answer, as the memory usage is so massive compared with the page size in question (4K), and the page size in turn is so massive compared to the alignments in use (8 and 16).
Each compiler is shipped with its own implementation of the STL, therefore you are comparing:
- GCC STL + Linux allocation routines
- VC++ STL + Windows allocation routines
It's quite difficult to draw a meaningful comparison here because you don't know which of the allocation routine or STL implementation (or possibly both) is actually responsible.
I do suppose that you are not comparing a 32-bits program with a 64-bits program, since this would be even less meaningful.
Some versions of VC++ use checked iterators (_SECURE_SCL) in release builds, too. VC2005 and VC2008 have them turned on by default. VC2010 disables them by default
Depending on your compiler, that could be another thing to check (and turn off).
Did you execute the test in release or debug mode under windows? STL in debug mode does a lot of extra checking; maybe it also uses more memory to be able to perform all checks.
For VC++ try to use /GS- command line switch.
When you say memory used is "not more than" are you referring to the peak memory usage or the mean memory usage during the lifetime of the application?
Make sure if the memory allocated by your application using 'new' or 'malloc' or any other memory allocation library call are released back using 'delete' or 'free' or any equivalent library call.
In Linux, you can use valgrind and check for memory leaks.
精彩评论