
Efficient evaluation of max(a,b) inside a loop, with regard to branch prediction?

What is an efficient way to calculate the maximum of two floats inside a for loop in C without using a conditional expression such as a > b ? a : b, which might stall the pipeline?

I am working with huge 3D arrays and have tons of loop iterations.


Check what your compiler outputs; it is probably "optimal" already. For instance:

float foo(float a, float b)
{
    return (a>b?a:b);
}

Compiled with GCC 4.5 at -O3, this generates the following assembly on x86_64:

Disassembly of section .text:

0000000000000000 <foo>:
   0:   f3 0f 5f c1             maxss  %xmm1,%xmm0
   4:   c3                      retq   

In other words, the compiler knows a lot about the instruction set you are targeting and about the semantics of your code. Let it do its job.
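Applied to the loop case from the question, the same ternary can be sketched as follows. This is only an illustration: the function name max_arrays and its parameters are hypothetical, not the asker's code. Because the expression has the same form as in foo(), an optimizing compiler targeting x86_64 will typically be able to use a max instruction here as well rather than a branch, but that depends on the compiler and flags, so it is worth verifying.

```c
#include <stddef.h>

/* Hypothetical helper: out[i] = max(a[i], b[i]) for n elements.
   The ternary is written exactly as in foo(), so the compiler is free
   to pick a branch-free instruction for it. */
void max_arrays(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (a[i] > b[i]) ? a[i] : b[i];
}
```

Compiling with gcc -O3 -S, or disassembling the object file with objdump -d, shows whether the loop body contains a conditional jump.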


I don't think this is faster than branching, but it seems to work:

#include <stdio.h>

#define FasI(f)  (*((int *) &(f)))
#define FasUI(f) (*((unsigned int *) &(f)))

#define lt0(f)  (FasUI(f) > 0x80000000U)
#define le0(f)  (FasI(f) <= 0)
#define gt0(f)  (FasI(f) > 0)
#define ge0(f)  (FasUI(f) <= 0x80000000U)


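int main(void)
{
    float a=11.0f,b=4.6f;
    float x=a-b;

    /* ge0(x) is 1 when a >= b and lt0(x) is 1 when a < b, so exactly one
       term survives. Selecting on lt0(b-a) and lt0(a-b) instead would
       print 0 when a == b, because neither difference is negative. */
    printf("%f\n",ge0(x)*a+lt0(x)*b);
    return 0;
}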

The defines were taken from The Aggregate Magic Algorithms.
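The pointer casts in those macros read a float through an int pointer, which runs into C's aliasing rules. A variant that copies the bits with memcpy might look like the sketch below. It is not part of the original answer: the names float_to_bits, bits_to_float and max_select are hypothetical, it assumes a 32-bit IEEE 754 float, and it swaps the multiplication for a bitwise select. NaN inputs are not handled.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of a float into an integer (assumes 32-bit float). */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Copy a 32-bit pattern back into a float. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Branch-free max: the sign bit of a - b decides which operand is kept. */
float max_select(float a, float b)
{
    /* mask is all ones when a - b has its sign bit set, otherwise zero */
    uint32_t mask = 0u - (float_to_bits(a - b) >> 31);
    uint32_t bits = (float_to_bits(a) & ~mask) | (float_to_bits(b) & mask);
    return bits_to_float(bits);
}
```

As with the macro version, there is no reason to expect this to beat the plain ternary; it mainly shows how the sign-bit test can be written without the pointer casts.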
