开发者

Undefined Behavior with the C++0x Closure: II

I find the use of the C++0x closure perplexing. My initial report, and the subsequent one, have generated more confusion than explanations. Below I will show you troublesome examples, and I hope to find out why there is an undefined behavior in the code. All the pieces of the code pass the gcc 4.6.0 compiler without any warning.

Program No. 1: It Works

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [=](int y) -> int { 
            return x+y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

The output meets the expectations:

2 2 2

2 2 2

2 2 2

2. Program No. 2: Closure, Works Fine

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " <<开发者_开发技巧 ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

The output is:

4 3 2

7 6 5

10 9 8

Program 3: Program No. 1 with std::function, Works Fine

#include <iostream>
#include <functional>     // std::function

int main(){

    typedef std::function<int(int)> fint2int_type;
    typedef std::function<fint2int_type(int)> parent_lambda_type;

    parent_lambda_type accumulator = [](int x) -> fint2int_type{
        return [=](int y) -> int { 
            return x+y;
        }; 
    };

    fint2int_type ac=accumulator(1);

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}   

The output is:

2 2 2

2 2 2

2 2 2

Program 4: Program No. 2 with std::function, Undefined Behavior

#include <iostream>
#include <functional>     // std::function

int main(){

    typedef std::function<int(int)> fint2int_type;
    typedef std::function<fint2int_type(int)> parent_lambda_type;

    parent_lambda_type accumulator = [](int x) -> fint2int_type{
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };

    fint2int_type ac=accumulator(1);

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

The first run of the program gives:

4 3 2

4 3 2

12364812 12364811 12364810

The second run of the same program:

4 3 2

4 3 2

1666060 1666059 1666058

The third one:

4 3 2

4 3 2

2182156 2182155 2182154

How does my use of the std::function break the code? why do Programs No.1 - 3 work well, and Program No. 4 is correct when calling ac(1) thrice(!)? Why does Program No. 4 get stuck on the next three cases as if the variable x had been captured by value, not reference. And the last three calls of ac(1) are totally unpredictable as if any reference to x would be lost.


I hope to find out why there is an undefined behavior in the code

Every time I deal with complex and intricated lambda, I feel it more easier to do first the translation into function-object form. Because lambdas are just syntactic sugar for function-object and for each lambda there is a one-to-one mapping with a corresponding function-object. This article explain really well how to do the translation : http://blogs.msdn.com/b/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx

So for example, your program no 2 :

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

would be approximately translate by the compiler into this one :

#include <iostream>

struct InnerAccumulator
{
    int& x;
    InnerAccumulator(int& x):x(x)
    {
    }
    int operator()(int y) const
    {
        return x+=y;
    }
};

struct Accumulator
{
    InnerAccumulator operator()(int x) const
    {
        return InnerAccumulator(x); // constructor
    }
};


int main()
{
    Accumulator accumulator;
    InnerAccumulator ac = accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

And now, the problem become quite obvious :

InnerAccumulator operator()(int x) const
{
   return InnerAccumulator(x); // constructor
}

Here the constructor of InnerAccumulator will take a reference to x, a local variable which will die as soon as you exit the operator() scope. So yes, you just get a plain good old undefined behavior as you suspected.


Let's try something entirely innocent-looking:

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);

    //// Surely this should be a no-op? 
    accumulator(666);
    //// There are no side effects and we throw the result away!

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl; 
}

Tada:

669 668 667 
672 671 670 
675 674 673 

Of course, this is not guaranteed behaviour either. Indeed, with optimizations enabled, gcc will eliminate the accumulator(666) call figuring it's dead code, and we again get the original results. And it is entirely within its rights to do so; in a conforming program, removing the call would indeed not affect the semantics. But in the realm of undefined behaviour, anything may happen.


EDIT

auto ac=accumulator(1);

std::cout << pow(2,2) << std::endl;

std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl; 

Without optimizations enabled, I get the following:

4
1074790403 1074790402 1074790401 
1074790406 1074790405 1074790404 
1074790409 1074790408 1074790407 

With optimizations enabled,

4
4 3 2 
7 6 5 
10 9 8

Again, C++ does not and cannot provide true lexical closures where the lifetime of local variables would get extended beyond their original scope. That would entail bringing garbage collection and heap-based locals to the language.

This is all rather academic, though, as capturing x by copy makes the program well-defined and to work as expected:

auto accumulator = [](int x) {
    return [x](int y) mutable -> int { 
        return x += y;
    }; 
};


Well, references become dangling when the referent goes away. You have a very fragile design if object A has a reference to some part of object B, unless object A in some way can guarantee the lifetime of object B (for instance, when A holds a shared_ptr to B anyway, or both are in the same scope).

References in lambda's are no magical exception. If you plan to return a reference to x+=y, you'd better make sure that x lives long enough. Here it's the argument int x initialized as part of the call accumulator(1). The lifetime of a function argument ends when the function returns.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜