Performance Cost of a Memcopy in C/C++
So whenever I write code I always think about the performance implications. I've often wondered: what is the "cost" of using a memcpy relative to other operations?
For example, I may be writing a sequence of numbers to a static buffer while concentrating on a frame within that buffer. To keep the frame once I get to the end of the buffer, I might memcpy all of it back to the beginning, OR I can implement an algorithm that amortizes the computation.
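For concreteness, here is a rough sketch of the two approaches I have in mind (the buffer and frame sizes are made-up placeholders):

    #include <cstring>
    #include <cstddef>

    // Illustrative sizes only.
    constexpr std::size_t BUF_SIZE = 1024;
    constexpr std::size_t FRAME    = 64;   // the "frame" is the last FRAME samples

    static int buf[BUF_SIZE];

    // Option 1: on reaching the end, memcpy the current frame back to the start.
    // (The regions don't overlap here because FRAME <= BUF_SIZE / 2; otherwise
    // memmove would be required.)
    std::size_t push_with_memcpy(std::size_t pos, int sample) {
        if (pos == BUF_SIZE) {
            std::memcpy(buf, buf + BUF_SIZE - FRAME, FRAME * sizeof(int));
            pos = FRAME;
        }
        buf[pos++] = sample;
        return pos;
    }

    // Option 2: treat the buffer as circular and never move data; the frame is
    // the FRAME samples preceding the write index, taken modulo BUF_SIZE.
    std::size_t push_circular(std::size_t pos, int sample) {
        buf[pos] = sample;
        return (pos + 1) % BUF_SIZE;
    }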
memcpy is generally optimized to maximize memory bandwidth for large copies. Of course, it's not as fast as avoiding a copy entirely, and for short copies of a fixed size, direct assignment may be faster, since memcpy has extra code to deal with odd lengths.
But when you need to copy a block of memory, it's hard to beat memcpy. It's highly portable, and most compilers go to great lengths to make it fast, whether that's by using SIMD instructions or by inlining it.
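For instance, a small fixed-size copy can simply be written as an assignment and left to the compiler. A minimal sketch, using a made-up Point type:

    #include <cstring>

    struct Point { double x, y, z; };           // small, trivially copyable type (illustrative)

    void copy_by_assignment(Point& dst, const Point& src) {
        dst = src;                              // the compiler emits an optimal fixed-size copy
    }

    void copy_by_memcpy(Point& dst, const Point& src) {
        std::memcpy(&dst, &src, sizeof(Point)); // equivalent here, but no clearer and no faster
    }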
It's ok to consider performance implications, but don't become too distracted from the real goal of writing good, clean code. If you are inclined to obsess about performance even when you know better, try to focus on higher-level implications and ignore the bit-by-bit stuff such as memcpy, which you can trust the compiler and library authors to optimize.
Generally avoid premature optimization of this low-level kind because it consumes your time, the effects bubble up to infect the entire program, and without measurements, you cannot expect to achieve any performance gains.
Consider Steve McConnell's book 'Code Complete'. Stealing shamelessly from there:
Algorithm improvement usually has the biggest payback in performance.
Simple statements allow the compiler to optimize effectively, cost the programmer little, and usually increase readability. They are a low-cost default you 'should' follow anyway.
As mentioned, memcpy has already been tuned heavily and is often very effective on larger memory blocks. So why avoid it if the situation calls for keeping the data?
In general, do not optimize for no reason. Suppose you write a report against a massive dataset. No user expects an instant response in that scenario. They start the job and go get a snack. So whether your code runs in ten minutes or three minutes doesn't matter. To them. They won't notice. And... they write your paycheck.
Optimization is a large upfront cost in programmer time. So spend that cost only where it is needed.
Well, first: you should think about performance here only if memory copying is actually your bottleneck (and that is a rare case).
Second, memcpy is implemented in assembly (see memcpy.asm in your C runtime's sources) and, I'd guess, is the fastest memory copying solution available.
Also worth mentioning: in general, raw memcpy calls in C++ should be avoided; prefer more abstract wrappers and routines.
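For example, a sketch of what such wrappers look like when you use standard library facilities (std::copy and container assignment):

    #include <algorithm>
    #include <array>
    #include <vector>

    int main() {
        std::array<int, 8> src{1, 2, 3, 4, 5, 6, 7, 8};
        std::array<int, 8> dst{};

        // std::copy states the intent; for trivially copyable element types,
        // compilers typically lower it to the same code as memcpy.
        std::copy(src.begin(), src.end(), dst.begin());

        // Container assignment copies the elements and manages the memory for you.
        std::vector<int> a{1, 2, 3};
        std::vector<int> b;
        b = a;
    }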
memcpy() copies the memory contents of the source to the destination. Copying is obviously linear in the amount of data in the source, and what constitutes the optimal copy unit is machine dependent. In any case, a lot of compiler optimization black magic can apply depending on the context of the operation. In C++ it is generally wiser to avoid memcpy and to use assignment or copy constructors instead.
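A short sketch of why (assuming C++17): for types that are not trivially copyable, memcpy is not even correct, whereas assignment and copy construction always do the right thing.

    #include <string>
    #include <type_traits>

    struct Pod    { int a, b; };                 // trivially copyable: memcpy would be legal
    struct Record { std::string name; int id; }; // not trivially copyable: memcpy would not be

    int main() {
        Pod p1{1, 2}, p2{};
        p2 = p1;                                  // plain assignment; as fast as memcpy here
        static_assert(std::is_trivially_copyable_v<Pod>);

        Record r1{"alice", 42};
        Record r2 = r1;                           // copy constructor does the right thing
        // std::memcpy(&r2, &r1, sizeof(Record)) would be undefined behaviour
        static_assert(!std::is_trivially_copyable_v<Record>);
        (void)p2; (void)r2;
    }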