Can different optimization levels lead to functionally different code?

I am curious about the liberties that a compiler has when optimizing. Let's limit this question to GCC and C/C++ (any version, any flavour of standard):

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

The example I have in mind is printing different bits of text in various constructors in C++ and getting a difference depending on whether copies are elided (though I've not been able to make such a thing work).

Counting clock cycles is not permitted. If you have an example for a non-GCC compiler, I'd be curious, too, but I can't check it. Bonus points for an example in C. :-)

Edit: The example code should be standard compliant and not contain undefined behaviour from the outset.

Edit 2: Got some great answers already! Let me up the stakes a bit: The code must constitute a well-formed program and be standards-compliant, and it must compile to correct, deterministic programs in every optimization level. (That excludes things like race-conditions in ill-formed multithreaded code.) Also I appreciate that floating point rounding may be affected, but let's discount that.

I just hit 800 reputation, so I think I shall blow 50 reputation as bounty on the first complete example to conform to (the spirit) of those conditions; 25 if it involves abusing strict aliasing. (Subject to someone showing me how to send bounty to someone else.)


The portion of the C++ standard that applies is §1.9 "Program execution". It reads, in part:

conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. ...

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input. ...

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions. ...

So, yes, code may behave differently at different optimization levels, but (assuming that every level yields a conforming implementation) it cannot behave observably differently.

EDIT: Allow me to correct my conclusion: Yes, code may behave differently at different optimization levels as long as each behavior is observably identical to one of the behaviors of the standard's abstract machine.


Floating point calculations are a ripe source for differences. Depending on how the individual operations are ordered, you can get more/less rounding errors.

Less than safe multi-threaded code can also have different results depending on how memory accesses are optimized, but that's essentially a bug in your code anyhow.

And as you mentioned, side effects in copy constructors can vanish when optimization levels change.


Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

Only if you trigger a compiler's bug.

EDIT

This example behaves differently on gcc 4.5.2:

void foo(int i) {
  foo(i+1);
}

int main() {
  foo(0);
}

Compiled with -O0, the program crashes with a segmentation fault (the unbounded recursion overflows the stack).
Compiled with -O2, it enters an endless loop (the tail call is turned into a jump, so the stack never grows).


OK, my flagrant play for the bounty, by providing a concrete example. I'll put together the bits from other people's answers and my comments.

For the purpose of showing different behaviour at different optimization levels, "optimization level A" shall denote gcc -O0 (I'm using version 4.3.4, but it doesn't matter much; I think any even vaguely recent version will show the difference I'm after), and "optimization level B" shall denote gcc -O0 -fno-elide-constructors.

Code is simple:

#include <iostream>

struct Foo {
    ~Foo() { std::cout << "~Foo\n"; }
};

int main() {
    Foo f = Foo();
}

Output at optimization level A:

~Foo

Output at optimization level B:

~Foo
~Foo

The code is totally legal, but the output is implementation-dependent because of copy constructor elision, and in particular it's sensitive to gcc's optimization flag that disables copy ctor elision.

Note that generally speaking, "optimization" refers to compiler transformations that can alter behavior that is undefined, unspecified or implementation-defined, but not behavior that is defined by the standard. So any example that satisfies your criteria necessarily is a program whose output is either unspecified or implementation-defined. In this case it's unspecified by the standard whether copy ctors are elided; I just happen to be lucky that GCC reliably elides them pretty much whenever allowed, but has an option to disable that.


For C, almost all operations are strictly defined in the abstract machine, and optimizations are only allowed if the observable result is exactly that of the abstract machine. Exceptions to that rule that come to mind:

  • undefined behavior doesn't have to be consistent between different compiler runs or executions of the faulty code
  • floating point operations may round differently
  • arguments to function calls can be evaluated in any order
  • expressions with volatile qualified type may or may not be evaluated just for their side effects
  • identical const qualified compound literals may or may not be folded into one static memory location


Anything that is Undefined Behavior according to the standard can change its behavior depending on optimization level (or moon-phase, for that matter).


Since copy constructor calls can be optimized away, even if they have side effects, having copy constructors with side-effects will cause unoptimized and optimized code to behave differently.


The -fstrict-aliasing option can easily cause changes in behavior if you have two pointers of different types to the same block of memory. This is supposed to be invalid but is actually quite common.


This C program invokes undefined behavior, but does display different results in different optimization levels:

#include <stdio.h>
/*
$ for i in 0 1 2 3 4 
    do echo -n "$i: " && gcc -O$i x.c && ./a.out 
  done
0: 5
1: 5
2: 5
3: -1
4: -1
*/

void f(int a) {
  int b;
  printf("%d\n", (int)(&a-&b));
}
int main() {
 f(0);
 return 0;
}


GCC defines the __OPTIMIZE__ macro when a non-zero optimization level is used. You can use it like this:

#include <stdio.h>

int main(void) {
#ifdef __OPTIMIZE__
    printf("Code compiled with -O1 or higher\n");
#else
    printf("Code compiled with -O0\n");
#endif
}


The same source code (screenshot not preserved), compiled before and after enabling -finline-small-functions (screenshots of both results not preserved).

-finline-small-functions is enabled at -O2/-O3.


Two C source files, linked into one program:

foo6.c

void p2(void);

int main() {
    p2();
    return 0;
}

bar6.c

#include <stdio.h>

char main;

void p2() {
    printf("0x%x\n", main);
}

When both modules are compiled into one executable at optimization levels one and zero, they print two different values: 0x48 for -O1 and 0x55 for -O0.

Here is an example of it working in my environment (terminal screenshot not preserved).


a.c:

char *f1(void) { return "hello"; }

b.c:

#include <stdio.h>

char *f1(void);

int main()
{
    if (f1() == "hello") printf("yes\n");
    else printf("no\n");
}

The output depends on whether the optimization that merges identical string constants (-fmerge-constants) is enabled or disabled:

$ gcc a.c b.c -o a -fno-merge-constants; ./a
no
$ gcc a.c b.c -o a -fmerge-constants; ./a
yes


I got an interesting example in my OS course today. We analyzed a software mutex that could be broken by optimization because the compiler does not know about the parallel execution.

The compiler can reorder statements that do not operate on dependent data. As I already stated, in parallelized code this dependency is hidden from the compiler, so the code could break. The example I gave would make for some hard debugging, because thread safety is broken and the code behaves unpredictably due to OS scheduling and concurrent-access errors.
