Comparing performance of software implementations
Hi all,
This is a more general question, but basically I want to compare the performance of two multimedia software applications. Although they do the same thing, they run on different platforms, and nothing is known about their implementations. I get quite different performance figures and I am trying to reason about what could be the cause. So far I have come up with the following:
Better performance due to software optimisations:
- loop unrolling at the cost of a higher code memory footprint
- pre-computation of results stored in memory at the cost of a higher data memory footprint
Better performance due to the underlying hardware architectures:
- running at a higher clock speed
- offering better hardware support for an application
- better caching opportunities
Can someone think of something else or is that all?
Thanks, Simon
I'd say:
- If you know nothing about the implementations, you won't be interested in how they work internally, including loop unrolling and the like.
- On that abstraction level, you'd probably want to measure end-user-related performance goals, like in this Wikipedia article.
- In particular, for single-user systems, response time and throughput are more important, while for multi-user systems, concurrency and throughput matter. The former are affected by clock speed and UI design; the latter are also affected by cache size and overall system behavior under high load.
- Martin Fowler has a good review of performance metrics in his PoEAA.
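To make the response-time and throughput metrics above concrete, here is a minimal sketch of how one might time a black-box workload from the outside. The `measure` helper and the dictionary keys are names I made up for illustration, and the workload is a stand-in; in practice `task` would launch or drive the application under test.

```python
import statistics
import time
from typing import Callable, Dict


def measure(task: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Run `task` repeatedly; report response-time and throughput figures."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    latencies = []
    start = time.perf_counter()
    for _ in range(runs):
        t0 = time.perf_counter()
        task()  # one unit of work, e.g. decoding one file (assumed)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        # Response time: how long a single unit of work takes.
        "median_response_s": statistics.median(latencies),
        "worst_response_s": max(latencies),
        # Throughput: units of work completed per second overall.
        "throughput_per_s": runs / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Placeholder workload; replace with a call that drives the real application.
    print(measure(lambda: sum(range(100_000))))
```

Running the same harness against both applications with the same input gives figures that can be compared without knowing anything about the implementations.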
Good question. Hardware certainly makes a difference, but software structural differences do too.
It is highly unlikely that micro-level optimizations like unrolling make much difference.
Reasoning about it will not take you very far - you need to investigate.
I'm not saying you can get a definitive answer to this question, but here's what I would do. Somehow, get 10 or 20 random-time stack samples, whether by interrupt-and-dump, pstack, lsstack, running under a debugger and using Ctrl-C, or a good stack-sampling profiler like RotateRight/Zoom.
You can look at those samples and get an idea, percentage-wise, of how each program is spending its time. If they are both near optimal, the pictures should look pretty similar, even if you don't know exactly what they're doing. If one, say, is spending a higher percentage of time in memory management, that's a red flag. If the call stack is typically much deeper on one than on the other, that's also cause for suspicion, not because calls are expensive, but because an over-general, wasteful coding style tends to show up that way.
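As a rough illustration of reading the samples percentage-wise, here is a small sketch that tallies how often each function appears on the stack and the average stack depth. The sample data and function names (`decode_frame`, `idct`, and so on) are invented for the example; real samples would come from whichever tool you used to capture them.

```python
from collections import Counter
from typing import Dict, Sequence


def inclusive_percent(samples: Sequence[Sequence[str]]) -> Dict[str, float]:
    """Percentage of samples in which each function appears anywhere on the stack."""
    if not samples:
        raise ValueError("need at least one stack sample")
    counts: Counter = Counter()
    for stack in samples:
        counts.update(set(stack))  # count a function once per sample, even if recursive
    return {fn: 100.0 * n / len(samples) for fn, n in counts.items()}


def mean_depth(samples: Sequence[Sequence[str]]) -> float:
    """Average number of frames per sample."""
    if not samples:
        raise ValueError("need at least one stack sample")
    return sum(len(stack) for stack in samples) / len(samples)


# Hypothetical samples, outermost frame first.
samples = [
    ["main", "decode_frame", "malloc"],
    ["main", "decode_frame", "idct"],
    ["main", "decode_frame", "malloc"],
    ["main", "render"],
]

if __name__ == "__main__":
    for fn, pct in sorted(inclusive_percent(samples).items(), key=lambda kv: -kv[1]):
        print(f"{fn:15s} {pct:5.1f}%")
    print("mean depth:", mean_depth(samples))
```

With only 10 or 20 samples the percentages are coarse, but anything that shows up in a large fraction of them, such as `malloc` in half of the made-up samples here, is worth a closer look.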
If you don't have symbols, it may take a fair amount of detective work to figure this out, and you may not be able to, but that is how I would approach it.