
What's the best way to measure and track performance over various calls at runtime?

I'm trying to optimize the performance of my code, but I'm not familiar with Xcode's debuggers or debuggers in general. Is it possible to track the execution time and frequency of calls being made at runtime?

Imagine a chain of events with some recursive calls over a fraction of a second. What's the best way to track where the CPU spends most of its time?

Many thanks.

Edit: Maybe this is better asked by saying, how do I use the Xcode debug tools to do a stack trace?


You want to use the built-in performance tools called 'Instruments'; check out Apple's guide to Instruments. Specifically, you probably want the System Instruments. There's also the Tuning Guide, which could be useful to you, and Shark.


Imagine a chain of events with some recursive calls over a fraction of a second. What's the best way to track where the CPU spends most of its time?

Short version of a previous answer:

  1. Learn an IDE or debugger. Make sure it has a "pause" button, or some other way to interrupt your program while it is running and taking too long.

  2. If your code runs too quickly to be manually paused, wrap a temporary loop of 10 to 1000 iterations around it (a short sketch follows below).

  3. When you pause it, copy the call stack into a text editor. Repeat several times.

Your answer will be in those stacks. If the CPU is spending most of its time in a statement, that statement will be at the bottom of most of the stack samples. If there is some function call that causes most of the time to be used, that function call will be on most of the stacks. It doesn't matter if it's recursive - that just means it shows up more than once on a stack.
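
As an illustration of step 2 above, here is a minimal Swift sketch of the temporary scaffolding; processDocument() is a hypothetical stand-in for whatever code you are investigating, not anything from the question:

    // Temporary scaffolding for manual stack sampling; delete it when done.
    // processDocument() is a hypothetical placeholder for the code under study.
    func processDocument() {
        // ... the work you suspect is slow ...
    }

    for _ in 0..<1000 {          // 10 to 1000 repetitions, per step 2
        processDocument()        // hit "pause" in the debugger while this loop runs
    }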

Don't think about measuring microseconds, or counting calls. Think about "percent of time active". That's what stack samples tell you, and that's roughly what you'll save if you fix it.

It's that simple.

BTW, when you fix that problem, you will get a speedup factor. Then, other issues in your code will be magnified by that factor, so they will be easier to find. This way, you can keep going until you've squeezed every cycle out of it.
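
To make that magnification concrete, here is a small sketch with made-up numbers (the 50% and 20% costs are assumptions, not measurements):

    // Hypothetical costs: problem A takes 50% of run time, problem B takes 20%.
    let total = 1.0
    let problemA = 0.50
    let problemB = 0.20

    // Fixing A alone: the program now takes 0.5 of its old time, a 2x speedup.
    let afterA = total - problemA
    let speedupA = total / afterA              // 2.0

    // B's absolute cost is unchanged, but it is now 40% of the shorter run,
    // so it stands out twice as clearly in the next round of stack samples.
    let bShareAfterA = problemB / afterA       // 0.4

    // Fixing B too compounds the factors: 1 / 0.3, roughly 3.3x overall.
    let overallSpeedup = total / (afterA - problemB)
    print(speedupA, bShareAfterA, overallSpeedup)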


The first thing I tell people is to recognize the difference between

1) timing routines and counting how many times they are called, and

2) finding code that you can fruitfully optimize.

For (1) there are instrumenting profilers. To be really successful at (2) you need a rare type of profiler. You need a sampling profiler that

  • samples the entire call stack, not just the program counter

  • samples at random wall clock times, not just CPU, so as to capture possible I/O problems

  • samples when you want it to (not when waiting for user input)

  • for output, gives you, for each line of code that appears on stack samples, the percent of samples containing that line. That is a direct measure of the total time that could be saved if that line were not there. (A rough sketch of this bookkeeping appears below.)

(I actually do it by hand, interrupting the program under the debugger.)
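
For a sense of what that output looks like, here is a rough Swift sketch of the bookkeeping: each sample is the set of call sites (file:line) copied from one paused stack, and the names and samples are invented purely for illustration.

    // Invented samples: each entry is the set of call sites on one paused stack.
    let samples: [Set<String>] = [
        ["main.swift:10", "parser.swift:52", "string.swift:7"],
        ["main.swift:10", "parser.swift:52", "string.swift:7"],
        ["main.swift:10", "render.swift:120"],
        ["main.swift:10", "parser.swift:52", "regex.swift:33"],
    ]

    // Count, for each line of code, how many samples it appears on.
    var appearances: [String: Int] = [:]
    for stack in samples {
        for line in stack {
            appearances[line, default: 0] += 1
        }
    }

    // Percent of samples containing each line: an estimate of the total time
    // that could be saved if that line were not there.
    for (line, count) in appearances.sorted(by: { $0.value > $1.value }) {
        let percent = 100.0 * Double(count) / Double(samples.count)
        print("\(line): \(percent)%")
    }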

Don't get sidetracked by problems you don't have, such as

  • accuracy of measurement. If a line of code appears on 30% of call stack samples, its actual cost could be anywhere in a range around 30%. If you can find a way to eliminate it or invoke it a lot less, you will save what it costs, even if you don't know in advance exactly what that cost is. (A small simulation after this list illustrates the point.)

  • efficiency of sampling. Since you don't need accuracy of time measurement, you don't need a large number of samples. Even a small number of samples is enough, because the costly lines of code cannot fail to show up on them.

  • call graphs. They make nice graphics, but are not what you need to know. An arc on a call graph corresponds to a line of code in the best case, usually to multiple lines, so knowing the cost of an arc only tells you the cost of a line in the best case. Call graphs concentrate on functions, when what you need to find is lines of code. Call graphs also get wrapped up in the issue of recursion, which is irrelevant.
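
On the accuracy point above, a toy simulation (the 30% cost and 10 samples are assumed numbers, not a real measurement) shows why a costly line is hard to miss even with very few samples:

    // Toy simulation: a line that truly costs 30% of run time, observed with
    // only 10 random stack samples, repeated over many trials.
    let trueCost = 0.30
    let sampleCount = 10
    let trials = 10_000

    var hits = 0                          // trials where the line appears >= 2 times
    for _ in 0..<trials {
        var seen = 0
        for _ in 0..<sampleCount where Double.random(in: 0..<1) < trueCost {
            seen += 1
        }
        if seen >= 2 { hits += 1 }
    }

    // The measured fraction wanders around 30%, but the line still shows up
    // on two or more of the 10 samples in the large majority of trials.
    print("Seen at least twice in \(100 * hits / trials)% of trials")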

It's important to understand what to expect. Many programmers, using traditional profilers, can get a 20% improvement, consider that terrific, count the profiler a winner, and stop there. Others, working with large programs, can often get speedup factors of 20 times. This is done by fixing a series of problems, each one giving a multiplicative speedup factor. As soon as the profiler fails to find the next problem, the process stops. That's why "good enough" isn't good enough.

Here is a brief explanation of the method.
