Performance difference between iterating once and iterating twice?
Consider something like...
for (int i = 0; i < test.size(); ++i) {
test[i].foo();
test[i].bar();
}
Now consider..
for (int i = 0; i < test.size(); ++i) {
test[i].foo();
}
for (int i = 0; i < test.size(); ++i) {
test[i].bar();
}
Is there a large difference in time spent between these two? I.e. what is the cost of the actual iteration? It seems like the only real operations you are repeating are an incremen开发者_如何学运维t and a comparison (though I suppose this would become significant for a very large n). Am I missing something?
First, as noted above, if your compiler can't optimize the size()
method out so it's just called once, or is nothing more than a single read (no function call overhead), then it will hurt.
There is a second effect you may want to be concerned with, though. If your container size is large enough, then the first case will perform faster. This is because, when it gets to test[i].bar()
, test[i]
will be cached. The second case, with split loops, will thrash the cache, since test[i]
will always need to be reloaded from main memory for each function.
Worse, if your container (std::vector
, I'm guessing) has so many items that it won't all fit in memory, and some of it has to live in swap on your disk, then the difference will be huge as you have to load things in from disk twice.
However, there is one final thing that you have to consider: all this only makes a difference if there is no order dependency between the function calls (really, between different objects in the container). Because, if you work it out, the first case does:
test[0].foo();
test[0].bar();
test[1].foo();
test[1].bar();
test[2].foo();
test[2].bar();
// ...
test[test.size()-1].foo();
test[test.size()-1].bar();
while the second does:
test[0].foo();
test[1].foo();
test[2].foo();
// ...
test[test.size()-1].foo();
test[0].bar();
test[1].bar();
test[2].bar();
// ...
test[test.size()-1].bar();
So if your bar()
assumes that all foo()
's have run, you will break it if you change the second case to the first. Likewise, if bar()
assumes that foo()
has not been run on later objects, then moving from the second case to the first will break your code.
So be careful and document what you do.
There are many aspects in such comparison.
First, complexity for both options is O(n)
, so difference isn't very big anyway. I mean, you must not care about it if you write quite big and complex program with a large n
and "heavy" operations .foo()
and bar()
. So, you must care about it only in case of very small simple programs (this is kind of programs for embedded devices, for example).
Second, it will depend on programming language and compiler. I'm assured that, for instance, most of C++ compilers will optimize your second option to produce same code as for the first one.
Third, if compiler haven't optimized your code, performance difference will heavily depend on the target processor. Consider loop in a term of assembly commands - it will look something like this (pseudo assembly language):
LABEL L1:
do this ;; some commands
call that
IF condition
goto L1
;; some more instructions, ELSE part
I.e. every loop passage is just IF
statement. But modern processors don't like IF
. This is because processors may rearrange instructions to execute them beforehand or just to avoid idles. With the IF
(in fact, conditional goto or jump) instructions, processors do not know if they may rearrange operation or not.
There's also a mechanism called branch predictor. From material of Wikipedia:
branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure.
This "soften" effect of IF
's, through if the predictor's guess is wrong, no optimization will be performed.
So, you can see that there's a big amount of conditions for both your options: target language and compiler, target machine, it's processor and branch predictor. This all makes very complex system, and you cannot foresee what exact result you will get. I believe, that if you don't deal with embedded systems or something like that, the best solution is just to use the form which your are more comfortable with.
For your examples you have the additional concern of how expensive .size()
is, since it's compared for each time i
increments in most languages.
How expensive is it? Well that depends, it's certainly all relative. If .foo()
and .bar()
are expensive, the cost of the actual iteration is probably minuscule in comparison. If they're pretty lightweight, then it'll be a larger percentage of your execution time. If you want to know about a particular case test it, this is the only way to be sure about your specific scenario.
Personally, I'd go with the single iteration to be on the cheap side for sure (unless you need the .foo()
calls to happen before the .bar()
calls).
I assume .size()
will be constant. Otherwise, the first code example might not give the same as the second one.
Most compilers would probably store .size()
in a variable before the loop starts, so the .size()
time will be cut down.
Therefore the time of the stuff inside the two for loops will be the same, but the other part will be twice as much.
Performance tag, right.
As long as you are concentrating on the "cost" of this or that minor code segment, you are oblivious to the bigger picture (isolation); and your intention is to justify something that, at a higher level (outside your isolated context), is simply bad practice, and breaks guidelines. The question is too low level and therefore too isolated. A system or program which is set of integrated components will perform much better that a collection of isolated components.
The fact that this or that isolated component (work inside the loop) is fast or faster is irrelevant when the loop itself is repeated unnecessarily, and which would therefore take twice the time.
Given that you have one family car (CPU), why on Earth would you:
- sit at home and send your wife out to do her shopping
- wait until she returns
- take the car, go out and do your shopping
- leaving her to wait until you return If it needs to be stated, you would spend (a) almost half of your hard-earned resources executing one trip and shopping at the same time and (b) have those resources available to have fun together when you get home.
It has nothing to do with the price of petrol at 9:00 on a Saturday, or the time it takes to grind coffee at the café, or cost of each iteration.
Yes, there is a large diff in the time and the resources used. But the cost is not merely in the overhead per iteration; it is in the overall cost of the one organised trip vs the two serial trips.
Performance is about architecture; never doing anything twice (that you can do once), which are the higher levels of organisation; integrated of the parts that make up the whole. It is not about counting pennies at the bowser or cycles per iteration; those are lower orders of organisation; which ajust a collection of fragmented parts (not a systemic whole).
Masseratis cannot get through traffic jams any faster than station wagons.
精彩评论