
How does a memory leak improve performance?

I'm building a large RTree (spatial index) full of nodes. It needs to be able to handle many queries AND updates. Objects are continuously being created and destroyed. The basic test I'm running is to see the performance of the tree as the number of objects in the tree increases. I insert from 100 to 20,000 uniformly sized, randomly located objects in increments of 100. Searching and updating are irrelevant to the issue I am currently faced with.
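For reference, a minimal sketch of the kind of timing loop described above (the RTree type and its insert signature here are hypothetical stand-ins, not the poster's actual code):

    #include <chrono>
    #include <cstdio>
    #include <random>

    // Hypothetical stand-ins for the real spatial index and object type.
    struct Rect { float x, y, w, h; };
    struct RTree { void insert(const Rect&) { /* ... */ } };

    int main() {
        std::mt19937 rng(42);
        std::uniform_real_distribution<float> pos(0.0f, 1000.0f);

        // Insert 100..20000 uniformly sized, randomly located objects
        // in increments of 100 and time each full build.
        for (int count = 100; count <= 20000; count += 100) {
            RTree tree;
            auto start = std::chrono::steady_clock::now();
            for (int i = 0; i < count; ++i)
                tree.insert(Rect{pos(rng), pos(rng), 1.0f, 1.0f});
            auto elapsed = std::chrono::duration<double>(
                std::chrono::steady_clock::now() - start).count();
            std::printf("%d objects: %.3f s\n", count, elapsed);
        }
    }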

Now, when there is NO memory leak, the "insert into tree" performance is all over the place. It goes anywhere from 10.5 seconds with ~15,000 objects to 1.5 seconds with ~18,000. There is no pattern whatsoever.

When I deliberately add in a leak, as simple as putting in "new int;" on a line by itself without assigning it to anything, the performance instantly falls onto a nice gentle curve sloping from roughly 0 seconds for 100 objects to 1.5 seconds for the full 20,000.

I'm very, very lost at this point. If you want the source code I can include it, but it's huge, and literally the only line that makes a difference is "new int;"

Thanks in advance! -nick


I'm not sure how you came up with this new int test, but it's not a very good way to fix things :) Run your code under a profiler and find out where the real delays are. Then concentrate on fixing the hot spots.

g++ has one built in: compile and link with -pg, run the program, and then inspect the generated gmon.out with gprof.


Without more information it's impossible to be sure.

However, I wonder if this is to do with heap fragmentation. By creating and freeing many blocks of memory you'll likely be creating a whole load of small fragments of memory linked together. The memory manager needs to keep track of them all so it can allocate them again if needed.

Some memory managers, when you free a block, try to merge it with the surrounding blocks of memory, and on a highly fragmented heap this can be very slow as they search for those surrounding blocks. Not only that, but if you have limited physical memory, following the chain of memory blocks can touch many physical pages, causing a whole load of extremely slow page faults whose cost will vary depending on exactly how much physical memory the OS decides to give the process.

By leaving some memory un-freed you are changing this pattern of access, which might make a large difference to the speed. You might, for example, be forcing the runtime library to allocate a new block of memory each time rather than having to track down a suitably sized existing block to reuse.
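As a rough illustration of the effect (this is not the poster's code, and whether it shows anything measurable depends entirely on your runtime's allocator), you can deliberately punch holes into the heap with many small allocations and then ask for larger blocks:

    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        // Fragment the heap: allocate many small blocks, then free every
        // other one, leaving lots of small holes on the free list.
        std::vector<char*> small;
        small.reserve(200000);
        for (int i = 0; i < 200000; ++i)
            small.push_back(new char[32]);
        for (std::size_t i = 0; i < small.size(); i += 2) {
            delete[] small[i];
            small[i] = nullptr;
        }

        // Now time larger allocations, which cannot reuse the small holes
        // and may force the allocator to search or request fresh memory.
        auto start = std::chrono::steady_clock::now();
        std::vector<char*> big;
        for (int i = 0; i < 10000; ++i)
            big.push_back(new char[4096]);
        auto elapsed = std::chrono::duration<double>(
            std::chrono::steady_clock::now() - start).count();
        std::printf("10000 x 4 KB allocations: %.4f s\n", elapsed);

        // Clean up (delete[] on the nullptr entries is harmless).
        for (char* p : small) delete[] p;
        for (char* p : big) delete[] p;
    }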

I have no evidence this is the case in your program, but I do know that heap fragmentation is often the cause of slow programs when a lot of allocating and freeing is performed.


A possible explanation for what is happening (a theory):

  1. The compiler does not optimize away the unused new int.
  2. The new int is in one of the inner loops, or somewhere in your recursive traversal, where it gets executed a very large number of times.
  3. The overall RSS of the process increases, and eventually so does the total memory being used by the process.
  4. There are page faults happening because of this
  5. Because of the page faults, the process becomes I/O bound instead of CPU bound, and as an end result you see a drop in throughput (a rough sketch for checking this follows below).

It would help if you could mention the compiler being used and the compiler options you are using to build the code.
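One way to test steps 3-5 of this theory is to sample the process's fault counters around the insert loop. This is only a sketch: getrusage is POSIX-only, so on Windows the equivalent process-memory counters would be needed instead.

    #include <sys/resource.h>   // getrusage (POSIX only)
    #include <cstdio>

    // Print minor/major page-fault counts and peak RSS so far
    // (ru_maxrss is in kilobytes on Linux).
    static void report(const char* label) {
        rusage ru{};
        getrusage(RUSAGE_SELF, &ru);
        std::printf("%s: minor faults=%ld, major faults=%ld, max RSS=%ld KB\n",
                    label, ru.ru_minflt, ru.ru_majflt, ru.ru_maxrss);
    }

    int main() {
        report("before build");
        // ... build the tree here ...
        report("after build");
    }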


I am taking a stab in the dark here, but the problem could be the way the heap gets fragmented. You said that you are creating and destroying large numbers of objects. I will assume that the objects are all of different sizes.

When one allocates memory on the heap, a cell of the size needed is broken off from the heap. When the memory is freed, the cell is added to a free list. When one does a new allocation, the allocator walks the free list until a cell that is big enough is found. When doing large numbers of allocations, the free list can get rather long and walking it can take a non-trivial amount of time.

Now an int is rather small. So when you do your new int, it may well eat up all the small heap cells on the free list and thus dramatically speed up larger allocations.

The chances are, however, that you are allocating and freeing similarly sized objects. If you use your own free lists, you will save yourself many heap walks and may dramatically improve performance. This is exactly what the STL allocators do to improve performance.
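A minimal sketch of that idea for fixed-size nodes (a hand-rolled free list, not the actual RTree code; it assumes all nodes are the same type, is not thread-safe, and for simplicity never returns slots to the system):

    #include <cstddef>
    #include <new>

    // Tiny fixed-size pool: freed nodes go onto an intrusive free list,
    // so most allocations are a pointer pop instead of a heap walk.
    template <typename T>
    class NodePool {
        union Slot { Slot* next; alignas(T) unsigned char storage[sizeof(T)]; };
        Slot* free_ = nullptr;
    public:
        void* allocate() {
            if (free_) {                       // reuse a previously freed slot
                Slot* s = free_;
                free_ = s->next;
                return s;
            }
            return ::operator new(sizeof(Slot));   // fall back to the heap
        }
        void deallocate(void* p) {             // push the slot onto the free list
            Slot* s = static_cast<Slot*>(p);
            s->next = free_;
            free_ = s;
        }
    };

    // Usage sketch (Node and args are placeholders):
    //   NodePool<Node> pool;
    //   Node* n = new (pool.allocate()) Node(args);
    //   n->~Node();
    //   pool.deallocate(n);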


Solution: do not run from within Visual Studio; actually run the .exe file directly. I figured this out because that's what the profilers were doing, and the numbers were magically dropping. I checked memory usage, and the version run directly (which was giving me EXCEPTIONAL times) was not blowing up to excessively huge sizes.

Solution to why the hell Visual Studio does ridiculous crap like this: No clue.
