Can the compiler elide the following copy?
I'm still a rookie programmer. I know that premature optimization is bad, but I also know that copying huge stuff around is bad as well.
I've read up on copy elision and its synonyms, but the examples on Wikipedia make it seem to me that copy elision can only take place if the object to be returned gets returned at the same time it gets completely constructed.
What about objects like vectors, which usually only make sense as a return value when filled with something? After all, an empty vector could just be instantiated manually.
So, does it also work in a case like this?
Bad style, for brevity:
vector<foo> bar(string baz)
{
    vector<foo> out;
    for (char letter : baz)  // C++11 range-based for over each letter
        out.push_back(someTable[letter]);
    return out;
}

int main()
{
    vector<foo> oof = bar("Hello World");
}
I have no real trouble using bar(vector & out, string text), but the above way would look so much better, aesthetically, and for intent.
"the examples on Wikipedia for example make it seem to me that copy elision can only take place if the object to be returned gets returned at the same time it gets completely constructed."
That is misleading (read: wrong). What matters is rather that the same object is returned on every code path, i.e. that only one object is ever constructed as the potential return value.
Your code is fine; any modern compiler can elide the copy.
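To make this checkable, here is a minimal sketch of the question's function in compilable form. It assumes C++11 or later. The names foo, someTable, lastBuffer and demoBar are hypothetical stand-ins (the question never defines foo or someTable). It records where the vector's elements live inside bar() and compares that with the caller's vector afterwards:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: the question never defines foo or someTable.
struct foo { int value; };
static foo someTable[256];  // zero-initialized lookup table

// Address of the local vector's element buffer, recorded just before returning.
static const foo* lastBuffer = nullptr;

std::vector<foo> bar(const std::string& baz)
{
    std::vector<foo> out;
    out.reserve(baz.size());
    for (char letter : baz)
        out.push_back(someTable[static_cast<unsigned char>(letter)]);
    lastBuffer = out.data();  // remember where the elements live
    return out;               // the same named object on every path
}

void demoBar()
{
    std::vector<foo> oof = bar("Hello World");
    assert(oof.size() == 11);
    // Holds whether the return was elided or (since C++11) performed as a
    // move: in both cases the elements themselves are not copied.
    assert(oof.data() == lastBuffer);
}
```

Note that the buffer comparison cannot tell elision apart from a move; it only shows that the elements are not duplicated on the way out.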
On the other hand, the following code could potentially generate problems:
vector<int> foo() {
    vector<int> a;
    vector<int> b;
    // … fill both.
    bool c;
    std::cin >> c;
    if (c) return a; else return b;
}
Here, the compiler needs to fully construct two distinct objects and only later decides which of them is returned. Hence it has to copy once (or, since C++11, move once), because it cannot construct the returned object directly in the target memory location.
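A small sketch of the two-object case with an instrumented type, assuming C++11 or later. Tracked, pick and demoTwoPaths are made-up names for illustration. Since a returned local is first treated as an rvalue, I would expect the transfer to show up as a move rather than a copy:

```cpp
#include <cassert>

// Hypothetical instrumented type that counts how it gets copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Two named locals are alive at the same time, so they cannot both be
// constructed in the caller's storage.
Tracked pick(bool first)
{
    Tracked a;
    Tracked b;
    if (first) return a; else return b;
}

void demoTwoPaths(bool first)
{
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = pick(first);
    (void)result;
    assert(Tracked::copies == 0);  // the returned local is treated as an rvalue
    assert(Tracked::moves >= 1);   // but at least one transfer is still needed
}
```

For a vector a move is cheap, so this case is less of a problem than it used to be, but it is still not free for types like the 1 KB CopyMe further down.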
There is nothing preventing the compiler from eliding the copy. This is defined in 12.8/15:
[...] This elision of copy operations is permitted in the following circumstances (which may be combined to eliminate multiple copies):
[...]
- when a temporary class object that has not been bound to a reference (12.2) would be copied to a class object with the same cv-unqualified type, the copy operation can be omitted by constructing the temporary object directly into the target of the omitted copy
Whether it actually does so depends on the compiler and the settings you use.
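The temporary case from the quoted bullet can be sketched like this. Noisy, makeNoisy and demoTemporary are hypothetical names. As far as I know, C++17 and later guarantee this particular elision, while earlier standards merely permit it, so the stricter check is guarded:

```cpp
#include <cassert>

// Hypothetical type that counts copy and move constructions.
struct Noisy {
    static int copies;
    static int moves;
    Noisy() {}
    Noisy(const Noisy&) { ++copies; }
    Noisy(Noisy&&) noexcept { ++moves; }
};
int Noisy::copies = 0;
int Noisy::moves = 0;

// Returns an unnamed temporary, the case the quoted bullet describes.
Noisy makeNoisy()
{
    return Noisy();
}

void demoTemporary()
{
    Noisy::copies = 0;
    Noisy::moves = 0;
    Noisy n = makeNoisy();  // temporary -> return value -> n
    (void)n;
    assert(Noisy::copies == 0);  // an rvalue never selects the copy constructor here
#if __cplusplus >= 201703L
    assert(Noisy::moves == 0);   // C++17 and later: constructed directly in n
#endif
}
```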
Both implied copies of the vector can be, and often are, eliminated. The named return value optimization can eliminate the copy implied in the return statement return out; and the temporary implied in the copy initialization of oof is allowed to be eliminated as well.
With both optimizations in play, the object constructed in vector<foo> out; is the same object as oof.
It's easier to test which of these optimizations are being performed with an artificial test case such as this.
struct CopyMe
{
    CopyMe();
    CopyMe(const CopyMe& x);
    CopyMe& operator=(const CopyMe& x);
    char data[1024]; // give it some bulk
};

void Mutate(CopyMe&);

CopyMe fn()
{
    CopyMe x;
    Mutate(x);
    return x;
}

int main()
{
    CopyMe y = fn();
    return 0;
}
The copy constructor is declared but not defined so that calls to it can't be inlined and eliminated. Compiling with a now comparatively old gcc 4.4 gives the following assembly at -O3 -fno-inline (filtered to demangle C++ names and edited to remove non-code).
fn():
pushq %rbx
movq %rdi, %rbx
call CopyMe::CopyMe()
movq %rbx, %rdi
call Mutate(CopyMe&)
movq %rbx, %rax
popq %rbx
ret
main:
subq $1032, %rsp
movq %rsp, %rdi
call fn()
xorl %eax, %eax
addq $1032, %rsp
ret
As can be seen, there are no calls to the copy constructor. In fact, gcc performs these optimizations even at -O0. You have to provide the -fno-elide-constructors option to turn this behaviour off; if you do, gcc generates two calls to the copy constructor of CopyMe, one inside and one outside of the call to fn().
fn():
movq %rbx, -16(%rsp)
movq %rbp, -8(%rsp)
subq $1048, %rsp
movq %rdi, %rbx
movq %rsp, %rdi
call CopyMe::CopyMe()
movq %rsp, %rdi
call Mutate(CopyMe&)
movq %rsp, %rsi
movq %rbx, %rdi
call CopyMe::CopyMe(CopyMe const&)
movq %rbx, %rax
movq 1040(%rsp), %rbp
movq 1032(%rsp), %rbx
addq $1048, %rsp
ret
main:
pushq %rbx
subq $2048, %rsp
movq %rsp, %rdi
call fn()
leaq 1024(%rsp), %rdi
movq %rsp, %rsi
call CopyMe::CopyMe(CopyMe const&)
xorl %eax, %eax
addq $2048, %rsp
popq %rbx
ret
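If reading assembly is not your thing, a run-time counter gives the same information. This is only a sketch: CountedCopyMe, copyCount, countedFn and copiesObserved are names made up here, and the mutation is done inline as a stand-in for Mutate(). Because a user-declared copy constructor suppresses the implicit move constructor, every non-elided transfer shows up as a copy:

```cpp
#include <cassert>
#include <cstring>

// Variant of CopyMe with a counter; all names here are hypothetical.
struct CountedCopyMe {
    static int copyCount;
    CountedCopyMe() : data() {}
    CountedCopyMe(const CountedCopyMe& other)
    {
        std::memcpy(data, other.data, sizeof data);
        ++copyCount;  // count every copy that is actually performed
    }
    char data[1024];  // give it some bulk
};
int CountedCopyMe::copyCount = 0;

CountedCopyMe countedFn()
{
    CountedCopyMe x;
    x.data[0] = 'x';  // stand-in for Mutate(x)
    return x;
}

// Returns how many copy constructor calls one round trip needed.
int copiesObserved()
{
    CountedCopyMe::copyCount = 0;
    CountedCopyMe y = countedFn();
    assert(y.data[0] == 'x');  // the mutation arrives either way
    return CountedCopyMe::copyCount;
}
```

Printing copiesObserved() should give 0 when both copies are elided. Going by the listing above, I would expect 2 with -fno-elide-constructors on a pre-C++17 compiler, and at most 1 under C++17, where the initialization of y from the returned value can no longer involve a copy.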