C-like callback handling: which algorithm performs faster?

I have an array of callbacks declared like this: void (*callbacks[n])(void* sender). I'm wondering which of the following two versions will perform faster:

//Method A
#include <stdio.h>

void nullcallback(void* sender) {}

void callbacka(void* sender)
{
    printf("Hello ");
}

void callbackb(void* sender)
{
    printf("world\n");
}

int main()
{
    void (*callbacks[5])(void* sender);
    unsigned i;
    for (i=0;i<5;++i)
        callbacks[i] = nullcallback;   // every slot starts as a no-op
    callbacks[2] = callbacka;
    callbacks[4] = callbackb;
    for (i=0;i<5;++i)
        callbacks[i](NULL);            // unconditional call
    return 0;
}

or

//Method B
#include <stdio.h>

void callbacka(void* sender)
{
    printf("Hello ");
}

void callbackb(void* sender)
{
    printf("world\n");
}

int main()
{
    void (*callbacks[5])(void* sender);
    unsigned i;
    for (i=0;i<5;++i)
        callbacks[i] = NULL;           // empty slots are marked with NULL
    callbacks[2] = callbacka;
    callbacks[4] = callbackb;
    for (i=0;i<5;++i)
        if (callbacks[i])              // test before every call
            callbacks[i](NULL);
    return 0;
}

Some further questions:

  • Does it matter whether I know that most of my callbacks are valid?
  • Does it make a difference if I compile my code with a C or a C++ compiler?
  • Does the target platform (Windows, Linux, Mac, iOS, Android) change anything in the results? (The whole reason for this callback array is to manage callbacks in a game.)


You'd have to look at the generated assembly for that. On my platform (gcc, 32-bit) I found that the compiler is not able to optimize away the calls to nullcallback. But if I rewrite your method A as follows

int main(void) {
  /* static const table, so the compiler knows every entry at compile time */
  static void (*const callbacks[5])(void* sender) = {
    [0] = nullcallback,
    [1] = nullcallback,
    [2] = callbacka,
    [3] = nullcallback,
    [4] = callbackb,
  };
  for (unsigned i=0;i<5;++i)
    callbacks[i](0);
}

the compiler is able to unroll the loop and eliminate the calls to nullcallback. The result is just

    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $16, %esp
    movl    $0, (%esp)
    call    callbacka
    movl    $0, (%esp)
    call    callbackb
    xorl    %eax, %eax
    leave
    ret
    .size   main, .-main


This depends entirely on your actual situation. If possible I would prefer method A, because it is simply easier to read and produces cleaner code, in particular if your callback has a return value:

ret = callbacks[UPDATE_EVENT](sender);
// is nicer than
if (callbacks[UPDATE_EVENT])
    ret = callbacks[UPDATE_EVENT](sender);
else
    ret = 0;

Of course method A becomes tedious when you have not just one function signature but, say, 100 different signatures, and you have to write a null function for each of them.

As for performance, it depends on whether nullcallback() is a rare case or not. If it is rare, method A is faster, because it avoids a test before every call. If not, method B could be slightly faster, because it skips the empty calls, but that depends on many factors: which platform you use, how many arguments your functions have, etc. In any case, if your callbacks are doing "real work", i.e. more than a few simple calculations, it shouldn't matter at all.
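Since the answer depends on the platform and on how many slots are empty, the only reliable way to decide is to measure. The following is a minimal timing sketch, not a rigorous benchmark: the names time_method_a, time_method_b, real_callback and work_done are made up for illustration, and clock() is coarse, so use a large number of rounds and repeat the measurement several times.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define SLOTS 5

typedef void (*callback_fn)(void *sender);

/* volatile so the compiler cannot drop the work done by the callback */
volatile unsigned long work_done;

void nullcallback(void *sender) { (void)sender; }
void real_callback(void *sender) { (void)sender; ++work_done; }

/* Method A: every slot is callable, so the call is unconditional.
   Returns the elapsed CPU time in seconds. */
double time_method_a(callback_fn table[SLOTS], unsigned long rounds)
{
    assert(table != NULL);
    clock_t start = clock();
    for (unsigned long r = 0; r < rounds; ++r)
        for (unsigned i = 0; i < SLOTS; ++i)
            table[i](NULL);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Method B: empty slots hold NULL and are tested before each call.
   Returns the elapsed CPU time in seconds. */
double time_method_b(callback_fn table[SLOTS], unsigned long rounds)
{
    assert(table != NULL);
    clock_t start = clock();
    for (unsigned long r = 0; r < rounds; ++r)
        for (unsigned i = 0; i < SLOTS; ++i)
            if (table[i])
                table[i](NULL);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Fill one table with nullcallback in the empty slots and another with NULL, call both functions with the same number of rounds, and compare the returned times while varying the share of empty slots.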

Where your method B could really be faster is when you call the callback not just for one sender but for very many:

extern void *senders[SENDERS_COUNT]; // SENDERS_COUNT is a large number

if (callbacks[UPDATE_EVENT])
{
    for (int i = 0; i < SENDERS_COUNT; i++)
        callbacks[UPDATE_EVENT](senders[i]);
} 

Here the entire loop is skipped when there is no valid callback. This tweak can also be done with method A if the address of nullcallback() is known, i.e. it is not hidden inside some other module.
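A minimal sketch of the same loop-skipping tweak applied to method A, assuming nullcallback is visible to the calling code. The names notify_all, on_update, calls_made and EVENT_COUNT are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { UPDATE_EVENT, EVENT_COUNT };   /* hypothetical event ids */

int calls_made;                       /* demo counter, bumped by on_update */

void nullcallback(void *sender) { (void)sender; }
void on_update(void *sender) { (void)sender; ++calls_made; }

/* Method A table: empty slots hold nullcallback instead of NULL */
void (*callbacks[EVENT_COUNT])(void *sender) = { nullcallback };

/* Calls the handler of `event` once per sender.
   Returns the number of senders that were notified. */
size_t notify_all(int event, void **senders, size_t count)
{
    assert(event >= 0 && event < EVENT_COUNT);
    if (callbacks[event] == nullcallback)
        return 0;                     /* nothing registered: skip the loop */
    for (size_t i = 0; i < count; i++)
        callbacks[event](senders[i]);
    return count;
}
```

The comparison against nullcallback plays the role of the NULL test, so single calls elsewhere in the code can stay unconditional while bulk loops are still skipped.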


You could optimize your code further by simply zero-initializing the array to start with, like this:

void (*callbacks[5])(void* sender) = { 0 };

Then you've completely eliminated the need for your for-loop to set each pointer to NULL. You now just have to make assignments for callbacka and callbackb.
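As a sketch of how far this can be taken in C99 and later: designated initializers can name the used slots directly, and every slot that is not named is implicitly a null pointer. Array designators are a C feature, so this form may not compile with a C++ compiler. The helpers set_callback and run_callbacks are made-up names for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define CALLBACK_COUNT 5

void callbacka(void *sender) { (void)sender; printf("Hello "); }
void callbackb(void *sender) { (void)sender; printf("world\n"); }

/* Slots 0, 1 and 3 are not named, so they are initialized to null */
void (*callbacks[CALLBACK_COUNT])(void *sender) = {
    [2] = callbacka,
    [4] = callbackb,
};

/* Registers (or clears, when fn is NULL) the callback in one slot */
void set_callback(unsigned slot, void (*fn)(void *sender))
{
    assert(slot < CALLBACK_COUNT);
    callbacks[slot] = fn;
}

/* Method B dispatch: returns how many callbacks were actually invoked */
int run_callbacks(void *sender)
{
    int invoked = 0;
    for (unsigned i = 0; i < CALLBACK_COUNT; ++i) {
        if (callbacks[i]) {
            callbacks[i](sender);
            ++invoked;
        }
    }
    return invoked;
}
```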


For the general case method B is preferred, but for function pointer lookup tables (LUTs) where NULL is the exception, method A is microscopically faster.

The primary example is the Linux system call table: NULL entries should only be hit in rare circumstances, such as when running binaries built on newer systems, or through programmer error. System calls occur often enough that nanosecond or even picosecond improvements can help.

Another case where it may prove worthwhile is opcode LUTs inside emulators such as MAME.
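To make the opcode LUT case concrete, here is a toy sketch of method A in a dispatch loop. It is not taken from MAME or any real emulator: cpu_state, the two opcodes and all function names are invented for illustration. Every unused table entry points at op_illegal, so the hot loop never has to test for NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy CPU: one accumulator plus a counter for bad opcodes */
typedef struct {
    uint8_t  acc;
    unsigned illegal_count;
} cpu_state;

typedef void (*opcode_fn)(cpu_state *cpu);

void op_illegal(cpu_state *cpu) { ++cpu->illegal_count; }
void op_inc(cpu_state *cpu)     { ++cpu->acc; }
void op_clear(cpu_state *cpu)   { cpu->acc = 0; }

opcode_fn opcode_table[256];

/* Fill every slot with the fallback, then install the real handlers */
void init_opcode_table(void)
{
    for (size_t i = 0; i < 256; ++i)
        opcode_table[i] = op_illegal;
    opcode_table[0x01] = op_inc;
    opcode_table[0x02] = op_clear;
}

/* Executes `length` one-byte instructions; call init_opcode_table() first */
void run(cpu_state *cpu, const uint8_t *program, size_t length)
{
    assert(cpu != NULL && program != NULL);
    for (size_t pc = 0; pc < length; ++pc)
        opcode_table[program[pc]](cpu);   /* unconditional indirect call */
}
```

Because a uint8_t can only index entries 0 to 255 and all of them are populated, the fallback handler doubles as the error path that method B would put behind an if.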
