
Tail call recursion

I'm implementing a function as follows:

void Add(list* node)
{
    if(this->next == NULL)
        this->next = node;
    else
        this->next->Add(node);
}

It seems Add will be tail-called at every step of the recursion.

I could also implement it as:

void Add(list *node)
{
    list *curr = this;
    while(curr->next != NULL) curr = curr->next;
    curr->next = node;
}

This will not use recursion at all.

Which version is better (in stack usage or speed)?

Please don't give the "Why don't use STL/Boost/whatever?" comments/answers.


Performance-wise they will probably be the same, since the compiler is likely to optimise both into exactly the same code.

However, if you compile with debug settings, the compiler will not perform tail-call optimisation, so if the list is long enough you can get a stack overflow. There is also the (very small) possibility that a poor compiler won't optimise the recursive version into a loop. There is no such risk with the iterative version.

Pick whichever one is clearer and easier for you to maintain, taking the possibility of non-optimisation into account.


I tried it out, making three files to test your code:

node.hh:

struct list {
  list *next;
  void Add(list *);
};

tail.cc:

#include "node.hh"

void list::Add(list* node)
{
    if(!this->next)
        this->next = node;
    else
        this->next->Add(node);
}

loop.cc:

#include "node.hh"

void list::Add(list *node)
{
    list *curr = this;
    while(curr->next) curr = curr->next;
    curr->next = node;
}

Compiled both files with G++ 4.3 for IA32, using -O3 and -S to produce assembly output rather than object files.

Results:

tail.s:

_ZN4list3AddEPS_:
.LFB0:
        .cfi_startproc
        .cfi_personality 0x0,__gxx_personality_v0
        pushl   %ebp
        .cfi_def_cfa_offset 8
        movl    %esp, %ebp
        .cfi_offset 5, -8
        .cfi_def_cfa_register 5
        movl    8(%ebp), %eax
        .p2align 4,,7
        .p2align 3
.L2:
        movl    %eax, %edx
        movl    (%eax), %eax
        testl   %eax, %eax
        jne     .L2
        movl    12(%ebp), %eax
        movl    %eax, (%edx)
        popl    %ebp
        ret
        .cfi_endproc

loop.s:

_ZN4list3AddEPS_:
.LFB0:
        .cfi_startproc
        .cfi_personality 0x0,__gxx_personality_v0
        pushl   %ebp
        .cfi_def_cfa_offset 8
        movl    %esp, %ebp
        .cfi_offset 5, -8
        .cfi_def_cfa_register 5
        movl    8(%ebp), %edx
        jmp     .L3
        .p2align 4,,7
        .p2align 3
.L6:
        movl    %eax, %edx
.L3:
        movl    (%edx), %eax
        testl   %eax, %eax
        jne     .L6
        movl    12(%ebp), %eax
        movl    %eax, (%edx)
        popl    %ebp
        ret
        .cfi_endproc

Conclusion: The output is similar enough (the core loop/recursion becomes movl, movl, testl, jne in both) that it really isn't worth worrying about. There's one less unconditional jump in the recursive version, although I wouldn't want to bet either way on which is faster, if the difference is even measurable at all. Pick whichever most naturally expresses the algorithm at hand; even if you later decide that was a bad choice, it's not too hard to switch.

Adding -g to the compilation doesn't change the actual implementation with g++ either, although there is the added complication that breakpoints no longer behave as you would expect: in my tests with GDB, a breakpoint on the tail-call line gets hit at most once (and not at all for a one-element list), regardless of how deep the recursion actually goes.

Timings:

Out of curiosity I ran some timings with the same variant of g++. I used:

#include <cstring>
#include "node.hh"

static const unsigned int size = 2048;
static const unsigned int count = 10000;

int main() {
   list nodes[size];
   for (unsigned int i = 0; i < count; ++i) {
      std::memset(nodes, 0, sizeof(nodes));
      for (unsigned int j = 1; j < size; ++j) {
        nodes[0].Add(&nodes[j]);
      }
   }
}

This was run 200 times with each of the loop and tail-call versions. The results with this compiler on this platform were fairly conclusive: the tail-call version had a mean of 40.52 seconds, whereas the loop version had a mean of 66.93 seconds (standard deviations 0.45 and 0.47 respectively).


So I certainly wouldn't be scared of using tail-call recursion if it seems the nicer way of expressing the algorithm, but I wouldn't go out of my way to use it either, since these timing observations would most likely vary across platforms and compiler versions.
