GCC inline asm NOP loop not being unrolled at compile time
I'm venturing out of my usual VC++ realm into the world of GCC (via MinGW32), trying to create a Windows PE that consists largely of NOPs, à la:
for(i = 0; i < 1000; i++)
{
    asm("nop");
}
But either I'm using the wrong syntax or the compiler is optimising through them because those NOPs don't survive the compilation process.
I'm using the -O0 flag, otherwise defaults. Any ideas on how I can coax the compiler into leaving the NOPs intact?
A convenient way to get 1000 inline nops is to use the .rept directive of the GNU assembler:
void thousand_nops(void) {
    asm(".rept 1000 ; nop ; .endr");
}
Try on godbolt.
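If the count needs to vary, the .rept line can be wrapped in a macro that pastes the count into the assembler string. This is only a sketch: NOPS, NOPS_STR, NOPS_STR2 and NOP_COUNT are names made up here, it assumes the GNU assembler, and the count has to be a literal or a macro that expands to one.

```cpp
// Two-level stringification, so that a macro argument such as NOP_COUNT
// is expanded before it is turned into a string.
#define NOPS_STR2(n) #n
#define NOPS_STR(n)  NOPS_STR2(n)

// Emits n consecutive nop instructions through the assembler's .rept directive.
// "volatile" is implied for an asm without outputs; it is spelled out for clarity.
#define NOPS(n) asm volatile(".rept " NOPS_STR(n) "\n\tnop\n\t.endr")

#define NOP_COUNT 1000   // hypothetical count, chosen for illustration

void thousand_nops_rept(void) {
    NOPS(NOP_COUNT);     // the assembler sees ".rept 1000"
}
```

Because the repetition happens in the assembler, the compiler's optimisation level has no say in how many nops come out.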
Are you expecting it to unroll the loop into 1000 nops? I did a quick test with gcc and I don't see the (one) nop disappear:
        xorl    %eax, %eax
        .p2align 4,,7
.L2:
#APP
        nop
#NO_APP
        addl    $1, %eax
        cmpl    $1000, %eax
        jne     .L2
With gcc -S -O3 -funroll-all-loops I see it unroll the loop 8 times (thus 8 nops), but I think if you want 1000 it's going to be easiest to do:
#define NOP10() asm("nop;nop;nop;nop;nop;nop;nop;nop;nop;nop")
And then use NOP10(); ...
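Getting from NOP10() to 1000 takes two more layers of the same trick. A sketch, where NOP100, NOP1000 and thousand_nops_macro are names invented here, and the nops are separated by newlines rather than semicolons (the GNU assembler accepts either on x86):

```cpp
// Ten nops in a single asm statement.
#define NOP10() \
    asm volatile("nop\n\tnop\n\tnop\n\tnop\n\tnop\n\t" \
                 "nop\n\tnop\n\tnop\n\tnop\n\tnop")

// Each layer repeats the previous one ten times.
#define NOP100() \
    do { NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); \
         NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); } while (0)

#define NOP1000() \
    do { NOP100(); NOP100(); NOP100(); NOP100(); NOP100(); \
         NOP100(); NOP100(); NOP100(); NOP100(); NOP100(); } while (0)

void thousand_nops_macro(void) {
    NOP1000();   // expands to 100 asm statements of 10 nops each
}
```

GCC does not promise that separate asm statements stay adjacent in the output, so if the nops must form one contiguous run it is worth checking the result with gcc -S.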
This recent question about looping to 1000 without conditionals resulted in a clever answer using template recursion which can actually be used to produce your 1000 nop function without repeating asm("nop") at all. There are some caveats: if you don't get the compiler to inline the function you will end up with a 1000-deep recursive stack of individual nop functions. Also, gcc's default template depth limit was 500 when this was written (900 in newer releases), so you must specify a higher limit explicitly (see below, though you could simply avoid exceeding nop<500>()).
// compile time recursion
template<int N> inline void nop()
{
    nop<N-1>();
    asm("nop");
}
template<> inline void nop<0>() { }
void nops()
{
    nop<1000>();
}
Compiled with:
g++ -O2 -ftemplate-depth=1000 ctr.c
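If the goal is only a compile-time count rather than recursion as such, the template parameter can also be handed straight to the assembler, which sidesteps both the inlining and the template-depth caveats. A sketch, assuming GCC-style extended asm and the GNU assembler; nops_rept and nops_via_rept are made-up names, and %c0 asks the compiler to print the constant operand as a bare number without an immediate prefix:

```cpp
// N is passed as an immediate ("i") operand; %c0 prints it without
// punctuation, so for N = 1000 the assembler sees ".rept 1000".
template<int N> inline void nops_rept()
{
    static_assert(N > 0, "repeat count must be positive");
    asm volatile(".rept %c0\n\tnop\n\t.endr" : : "i"(N));
}

void nops_via_rept()
{
    nops_rept<1000>();   // a single instantiation, no recursion
}
```

This needs no -ftemplate-depth option, since only one template instantiation is involved.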
In addition to the answer by @BenJackson: it can recurse with far less depth by (binary) division.
template<unsigned int N> inline void nop()
{
    nop<N/2>();
    nop<N/2>();
    nop<N-2*(N/2)>();
}
template<> inline void nop<0>() { }
template<> inline void nop<1>() { asm("nop"); }
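Both recursive versions still depend on the optimiser inlining every call. One way to stop depending on the optimisation level is GCC's always_inline attribute; the sketch below applies it to the binary-division variant and adds a compile-time check of the arithmetic. The names nop_count and nops_binary are made up for this illustration, and the attribute syntax assumes GCC or a compatible compiler.

```cpp
// Number of nops that nop<N>() expands to, mirroring its recursion.
constexpr unsigned nop_count(unsigned n)
{
    return n < 2 ? n : 2 * nop_count(n / 2) + nop_count(n - 2 * (n / 2));
}
static_assert(nop_count(1000) == 1000, "binary split must add up to N");

// always_inline asks GCC to inline these calls even at -O0.
template<unsigned int N> __attribute__((always_inline)) inline void nop()
{
    nop<N/2>();          // first half
    nop<N/2>();          // second half
    nop<N-2*(N/2)>();    // remainder: 0 or 1
}
template<> __attribute__((always_inline)) inline void nop<0>() { }
template<> __attribute__((always_inline)) inline void nop<1>() { asm volatile("nop"); }

void nops_binary()
{
    nop<1000>();         // recursion depth is about log2(1000), i.e. 10
}
```

With a depth of roughly ten, no -ftemplate-depth option is needed for N = 1000.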