开发者

Haskell Heap Issues with Parameter Passing Style

Here's a simple program that blows my heap to Kingdom Come:

intersect n k z s rs c
  | c == 23   = rs
  | x == y    = intersect (n+1) (k+1) (z+1) (z+s) (f : rs) (c+1)
  | x < y     = intersect (n+1) k (z+1) s rs c
  | otherwise = intersect n (k+1) z s rs c
    where x = (2*n*n) + 4 * n
          y = (k * 开发者_如何学JAVAk + k )
          f = (z, (x `div` 2), (z+s))
p = intersect 1 1 1 0 [] 0

main = do
  putStr (show p)

What the program does is calculate the intersection of two infinite series, stopping when it reaches 23 elements. But that's not important to me.

What's interesting is that as far as I can tell, there shouldn't be much here that is sitting on the heap. The function intersect is recursives with all recursions written as tail calls. State is accumulated in the arguments, and there is not much of it. 5 integers and a small list of tuples.

If I were a betting person, I would bet that somehow thunks are being built up in the arguments as I do the recursion, particularly on arguments that aren't evaluated on a given recursion. But that's just a wild hunch.

What's the true problem here? And how does one fix it?


If you have a problem with the heap, run the heap profiler, like so:

$ ghc -O2 --make A.hs -prof -auto-all -rtsopts -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A.exe ...

Which when run:

$ ./A.exe +RTS -M1G -hy

Produces an A.hp output file:

$ hp2ps -c A.hp

Like so:

Haskell Heap Issues with Parameter Passing Style

So your heap is full of Integer, which indicates some problem in the accumulating parameters of your functions -- where all the Integers are.

Modifying the function so that it is strict in the lazy Integer arguments (based on the fact you never inspect their value), like so:

{-# LANGUAGE BangPatterns #-}

intersect n k !z !s rs c
  | c == 23   = rs
  | x == y    = intersect (n+1) (k+1) (z+1) (z+s) (f : rs) (c+1)
  | x < y     = intersect (n+1) k (z+1) s rs c
  | otherwise = intersect n (k+1) z s rs c
    where x = (2*n*n) + 4 * n
          y = (k * k + k )
          f = (z, (x `div` 2), (z+s))

p = intersect 1 1 1 0 [] 0

main = do
  putStr (show p)

And your program now runs in constant space with the list of arguments you're producing (though doesn't terminate for c == 23 in any reasonable time).

Haskell Heap Issues with Parameter Passing Style


If it is OK to get the resulting list reversed, you can take advantage of Haskell's laziness and return the list as it is computed, instead of passing it recursively as an accumulating argument. Not only does this let you consume and print the list as it is being computed (thereby eliminating one space leak right there), you can also factor out the decision about how many elements you want from intersect:

{-# LANGUAGE BangPatterns #-}

intersect n k !z s
  | x == y    = f : intersect (n+1) (k+1) (z+1) (z+s)
  | x < y     = intersect (n+1) k (z+1) s
  | otherwise = intersect n (k+1) z s
    where x = (2*n*n) + 4 * n
          y = (k * k + k )
          f = (z, (x `div` 2), (z+s))
p = intersect 1 1 1 0

main = do
  putStrLn (unlines (map show (take 23 p)))

As Don noted, we need to be careful so that accumulating arguments evaluate timely instead of building up big thunks. By making the argument z strict we ensure that all arguments will be demanded.

By outputting one element per line, we can watch the result being produced:

$ ghc -O2 intersect.hs && ./intersect
[1 of 1] Compiling Main             ( intersect.hs, intersect.o )
Linking intersect ...
(1,3,1)
(3,15,4)
(10,120,14)
(22,528,36)
(63,4095,99)
(133,17955,232)
(372,139128,604)
(780,609960,1384)
...
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜