
Best strategy for profiling memory usage of my code (open source) and 3rd party code(closed source)

I am soon going to be tasked with doing a proper memory profile of a codebase written in C/C++ that uses CUDA to take advantage of GPU processing.

My initial thought is to create macros and operator overloads that would allow me to track calls to malloc, free, new, and delete within my own source code. I would just include a different header and use the __FILE__ and __LINE__ macros to print memory calls to a log file. This type of strategy is described here: http://www.almostinfinite.com/memtrack.html

What is the best way to track that usage in a linked-in 3rd party library? I am assuming I'd pretty much only be able to track memory usage before and after the function calls, correct? In my macro/overload scenario, I can simply track the size of the requests to figure out how much memory is being asked for. How would I be able to tell how much the 3rd party lib is using? It is also my understanding that tracking "free" doesn't really tell you how much memory the code is using at any particular time, because freed memory is not necessarily returned to the OS. I appreciate any discussion of the matter.

I don't really want to use memory-profiling tools like TotalView or valgrind, because they typically do a lot of other things (bounds checking, etc.) that seem to make the software run very slowly. Another reason is that I want this to be somewhat thread safe - the software uses MPI, I believe, to spawn processes. I am going to try to profile this in real time so I can dump out to log files or something that another process can read to visualize memory usage as the software runs. This will also primarily be run in a Linux environment.

Thanks


Maybe the linker option --wrap=symbol can help you. A really good example can be found in man ld.


Maybe valgrind and the Massif tool?


To track real-time memory consumption of my programs on Linux I simply read /proc/[pid]/stat. It's a fairly lightweight operation, and its cost could be negligible in your case if the 3rd party library you want to track does substantial work. If you want memory information while the 3rd party library is working, you can read the stat file from an independent thread or from another process. (Memory peaks rarely happen exactly before or after function calls!)

For the CUDA/GPU side I think gDEBugger could help you. I am not sure, but the memory analyzer does not seem to affect performance much.


You could try Google's PerfTools' Heap-Profiler:

http://google-perftools.googlecode.com/svn/trunk/doc/heapprofile.html

It's very lightweight; it literally replaces malloc/calloc/realloc/free to add instrumentation code. It's primarily tested on Linux platforms.

If you have compiled with debugging symbols, and your third-party libraries come with debug-version variants, PerfTools should do very well. If you don't have debug-symbol libraries, build your code with debug symbols anyway. That will give you detailed numbers for your code, and all the leftover can be attributed to the third-party library.


If you don't want to use an "external" tool, you can try to use tools like:

  • mtrace

    It installs handlers for malloc, realloc and free and logs every operation to a file. See the Wikipedia article I linked for code usage examples.

  • dmalloc

    It's a library you can use in your code, and can find memory leaks, off-by-one errors and usage of invalid addresses. You can also disable it at compile time with -DDMALLOC_DISABLE.

Anyway, I would rather not take this approach. Instead, I suggest you stress-test your application on a test server under valgrind (or any equivalent tool) to ensure you're doing memory allocation right, and then let the application run without any memory-allocation checking in production to maximize speed. But, in fact, it depends on what your application does and what your needs are.


You could use the profiler included in Visual Studio 2010 Premium and Ultimate.

It lets you choose between different methods of performance measuring, the most useful for you will probably be CPU sampling because it freezes your program at arbitrary time intervals and figures out which functions it is currently executing, thereby not making your program run substantially slower.


I believe that this question has two very separate answers. One for C/C++ land. And a second for CUDA land.

On the CPU:

I've written my own replacements for new and delete. They were horribly slow and didn't help much. I've used TotalView. I like TotalView for OpenMP debugging, but I agree it's very slow for memory debugging. I've never tried valgrind. I've heard similar things about it.

The only memory debugging tool which I've encountered worth its salt is Intel Parallel Inspector's Memory Checker. Note: As I'm a student, I was able to get an education license on the cheap. That said, it's amazing. It took me twelve minutes to find a memory leak buried in half a million lines of code -- I wasn't releasing a thrown error object which I caught and ignored. I like this one piece of software so much that when my raid failed / Win 7 ate my computer (think autoupdate & raid rebuild simultaneously), I stopped everything and rebuilt the computer because I knew it would take me less time to rebuild the dual boot (48 hours) than it would've to find the memory leak another way. If you don't believe my outlandish claims, download an evaluation version.

On the GPU:

I think you're out of luck. For all memory issues in CUDA, I've essentially had to home-grow my own tools and wrappers around cudaMalloc etc. It isn't pretty. nSight does buy you something, but at this point, not much beyond just a "here's how much you've allocated riiiight now." And on that sad note, almost every performance issue I've had with CUDA was directly dependent on my memory access patterns (that or my thread block size).
