开发者

Haskell - efficient equivalent of for loop?

I've been doing some experiments and here's something I found. Consider the following C program:

int main()
{
   for(int i = 0;
       i < 1000000;
       ++i)
   {}
}

and the following Haskell program:

import System.IO

loop :: Int -> IO ()
loop n = if 0 == n then return () else loop (n-1)

main = loop 1000000

Here is the output of time for the above C program:

real    0m0.003s
user    0m0.000s
sys 0m0.000s

...and for the Haskell program:

real    0m0.028s
user    0m0.027s
sys 0m0.000s

At first I thought that gcc detected an empty loop and o开发者_如何学运维ptimized it out, but after increasing the number of iterations, the running time of the program also increased. Here are the outputs of time for both of the programs, with the number of iterations set to 10000000:

C version

real    0m0.024s
user    0m0.023s
sys 0m0.000s

Haskell version

real    0m0.245s
user    0m0.247s
sys 0m0.000s

As you can see, the Haskell program is 10 times slower.

Question is: what is the efficient alternative to the for loop in Haskell? As we just saw, simple recursion slows the program down about 10 times (and this is probably with the tail recursion optimizations done).


Firstly, you'd translate your C code to this,

main = go 0
    where
        go :: Int -> IO ()
        go i | i < 1000000 = go (i+1)
             | otherwise   = return ()

which ghc optimizes to the empty program. It moves the final value into a register, compares against it, and returns ():

Main_zdwa_info: 
    cmpq    $1000000, %r14          # imm = 0xF4240
    movl    $1000000, %eax          # imm = 0xF4240
    cmovlq  %rax, %r14
    movq    (%rbp), %rax
    movl    $ghczmprim_GHCziUnit_Z0T_closure+1, %ebx
    jmpq    *%rax  # TAILCALL

which when run:

$ time ./A
./A  0.00s user 0.00s system 88% cpu 0.004 total

takes no time.


In general, though, GHC emits equivalent loops to e.g. GCC, for tail-recursive functions. In particular, you'll want to compile your numeric benchmarks with ghc -O2 -fllvm for best results. If you don't want your programs optimized, then ghc will happily execute exactly the program you specify, which, in this case, involves lots of redundant work that would be removed at higher optimization levels.

For more information on analyzing low-level performance of Haskell programs, check RWH ch25.


For looping constructs, you often have to write in the Worker/Wrapper style to help GHC spot optimizations rather than recur at the outer function level.

Grigory Javadyan has said in a comment that the original version gets optimized out at -O3, I'd expect this version gets detected at -O2:

import System.IO

loop :: Int -> IO ()
loop n = go n
  where
    go n | n <= 0 = return ()
    go n          = go (n-1)

main = loop 1000000
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜