Can gdb or another tool be used to detect the parts of a complex program (e.g. loops) that take more time than expected, so they can be targeted for optimization?
As the title implies, basically: say we have a complex program and we want to make it as fast as we can. Can we somehow detect which loops or other parts of its structure take most of the time, so we can target them for optimization?
Edit: Note that the software is assumed to be very complex, so we can't check each loop or other structure one by one by putting timers in them, etc.
You're looking for a profiler. There are several around; since you mention gcc you might want to check gprof (part of binutils). There's also Google Perf Tools although I have never used them.
You can use GDB for that, by this method.
Here's a blow-by-blow example of using it to optimize a realistically complex program.
You may find "hotspots" that you can optimize, but more generally the things that give you the greatest opportunity for saving time are mid-level function calls that you can avoid.
One example is, say, calling a function to extract information from a database, where the function is being called multiple times, when with some extra coding the result from a prior call could be used. Often such calls are small and innocent-looking, and you're totally surprised to learn how much they're costing, as an overall percent of time.
Another example is doing some low-level I/O that escapes attention, but actually costs a hefty percent of clock time.
Another example is tidal waves of notifications that propagate from seemingly trivial changes to data.
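To make the first example above concrete, here is a minimal sketch of reusing the result of a prior call instead of repeating it. It is written in Python for brevity; `fetch_record`, `fetch_record_cached`, and `report` are hypothetical names, and `fetch_record` is only a stand-in for a real database lookup. The sketch assumes the underlying data does not change between calls, which is the "extra coding" you would have to get right in a real program.

```python
import functools

# Hypothetical stand-in for an expensive lookup (e.g. a database query).
# call_count lets us see how often the expensive path really runs.
call_count = 0

def fetch_record(key):
    global call_count
    call_count += 1
    return {"key": key, "value": key.upper()}

# Remember results of prior calls so repeated keys cost nothing extra.
# Assumption: the data behind fetch_record does not change in between.
@functools.lru_cache(maxsize=None)
def fetch_record_cached(key):
    return fetch_record(key)

def report(keys):
    # An innocent-looking call inside a loop: without the cache it would
    # hit the expensive lookup once per element, not once per distinct key.
    return [fetch_record_cached(k)["value"] for k in keys]

values = report(["a", "b", "a", "a", "b"])
print(values, "expensive calls:", call_count)
```

Note that the cache hands back the same object on every hit, so callers should treat the result as read-only.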
Another good tool for finding these problems is Zoom.
Here's a discussion of the technical issues, but basically what to look for is:
It should tell you inclusive percent of time, at line-level resolution, not just functions. a) Only knowing that a function is costly still leaves you wondering where the lines are in it that you should look at. b) Inclusive percent tells the true cost of the line - how much bottom-line time it is responsible for and would not be spent if it were not there.
It should include both I/O (i.e. blocked) time and CPU time, not just CPU time. A tool that only considers CPU time will not see the first two problems mentioned above.
If your program is interactive, the tool should operate only during the time you care about, and not while waiting for user input. You don't want to include head-scratching time in your program's performance statistics.
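As a rough illustration of what "inclusive percent at line-level resolution" means, the sketch below computes it from a handful of call-stack samples. The sample data and the `inclusive_percent` helper are made up for illustration; the sketch assumes the samples were taken at wall-clock intervals (so blocked time is included) and only while the program was doing work you care about.

```python
from collections import Counter

def inclusive_percent(samples):
    """Return {(function, line): percent of samples whose stack contains it}.

    samples: list of stacks, each a list of (function, line) frames,
    outermost frame first.
    """
    total = len(samples)
    if total == 0:
        return {}
    counts = Counter()
    for stack in samples:
        # Count a line at most once per sample, even under recursion.
        for frame in set(stack):
            counts[frame] += 1
    return {frame: 100.0 * n / total for frame, n in counts.items()}

# Hypothetical samples: four snapshots of the call stack.
samples = [
    [("main", 10), ("load", 42), ("read", 7)],
    [("main", 10), ("load", 42), ("read", 7)],
    [("main", 10), ("load", 42), ("parse", 3)],
    [("main", 12), ("render", 88)],
]

percents = inclusive_percent(samples)
for frame, pct in sorted(percents.items(), key=lambda kv: -kv[1]):
    print(frame, pct)
```

Here the call on line 42 of `load` appears in three of the four samples, so it is responsible for roughly 75% of the time, regardless of which lower-level routine was actually running at that moment.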
gprof breaks it down by function. If you have many different loops in one function, it might not tell you which loop is taking the time. This is a clue to refactor ;-)