Performance profiling on Linux
What are the best tools for profiling C/C++ applications on *nix?
(I'm hoping to profile a server that is a mix of (blocking) file IO, epoll for netw开发者_如何学运维ork and fork()/execv() for some heavy lifting; but general help and more general tools are all also appreciated.)
Can you get the big system picture of RAM, CPU, network and disk all in one overview, and drill into it?
There's been a lot of talk on the kernel lists about things like perf timechart
, but I haven't found anything turning up in Ubuntu yet.
I recommend taking stackshots, for which pstack is useful. Here's some more information:
Comments on gprof.
How stackshots work.
A blow-by-blow example.
A very short explanation.
If you want to spend money, Zoom looks like a pretty good tool.
For performance, you can try Callgrind, a Valgrind tool. Here is a nice article showing it in action.
Compile with -pg, run the program, and then use gprof
Compiling (and linking) with -pg adds profiling code and the profiling libraries to the executable, which then produces a file called gmon.out that contains the timing information. gprof displays call graphs and their (absolute and relative) timings.
See man gprof
for details.
The canonical example of a full system profiling tool (for Solaris, OS X, FreeBSD) is DTrace. But it is not yet fully available on Linux (you can try here but the site is down for me at the moment, and I haven't tried it myself). There are many tools, in various states of usefulness, for doing full system profiling and kernel profiling on Linux.
You might consider investigating:
- oprofile
- SystemTap
- bootchart
- strace (e.g. this SO answer
Allinea MAP is a profiler for C++ and other native languages on Linux. It is commercially supported by my employer. It has a graphical interface and source-line level profiling and profiles code with almost no slowdown which makes it very accurate where timing of other subsystems is relevant - such as for IO.
Callgrind has been useful and accurate - but the slowdown was ~5x so I could only do smaller runs. It can actually count the number of times a function is called which is useful for understanding asymptotic behavior.
Description of using -gp and gproff here http://www.ibm.com/developerworks/library/l-gnuprof.html
oprofile might interest you. Ubuntu should have all the packages you need.
If you can take your application to freeBSD, OS X , or Solaris you can use dtrace, although dtrace is an analyst oriented tool -- i.e., you need to drive it -- read: script it. Nothing else can give you the level of granularity you need; Dtrace can not just profile the latencies of function calls in user-land; it can also follow a context switch into the kernel.
As mentioned in the accepted answer, Zoom can do some amazing things. I've used it to understand thread behavior all the way down to optimizing the assembly generated by the compiler.
The FOSS answer, as already mentioned, is to build with -pg and then use gprof to analyse the output. If it's a product/project that justifies throwing some money at, I would also be tempted to use IBM/Rationals Quantify profiler as that makes it easier to view the profiling data, drill down to the line level or look at it in a '10000ft' level.
Of course there might be viewer for gprof available that can do the same thing, but I am not aware of any.
精彩评论