Confusion regarding laziness
I have a function
myLength = foldl (\ x _ -> x + 1) 0
which fails with stack overflow with input around 10^6 elements (myLength [1..1000000] fails). I believe that is due to the thunk build up since when I replace foldl with foldl', it works. So far so good.
But now I have another function to reverse a list :
myReverse = foldl (\ acc x -> x : acc) []
which uses th开发者_JAVA百科e lazy version foldl (instead of foldl')
When I do
myLength . myReverse $ [1..1000000]
.
This time it works fine. I fail to understand why foldl works for the later case and not for former?
To clarify here myLength uses foldl' while myReverse uses foldl
Here's my best guess, though I'm no expert on Haskell internals (yet).
While building the thunk, Haskell allocates all the intermediate accumulator variables on the heap.
When performing the addition as in myLength
, it needs to use the stack for intermediate variables. See this page. Excerpt:
The problem starts when we finally evaluate z1000000:
Note that z1000000 = z999999 + 1000000. So 1000000 is pushed on the stack. Then z999999 is evaluated.
Note that z999999 = z999998 + 999999. So 999999 is pushed on the stack. Then z999998 is evaluated:
Note that z999998 = z999997 + 999998. So 999998 is pushed on the stack. Then z999997 is evaluated:
However, when performing list construction, here's what I think happens (this is where the guesswork begins):
When evaluating z1000000:
Note that z1000000 = 1000000 : z999999. So 1000000 is stored inside z1000000, along with a link (pointer) to z999999. Then z999999 is evaluated.
Note that z999999 = 999999 : z999998. So 999999 is stored inside z999999, along with a link to z999998. Then z999998 is evaluated.
etc.
Note that z999999, z999998 etc. changing from a not-yet-evaluated expression into a single list item is an everyday Haskell thing :)
Since z1000000, z999999, z999998, etc. are all on the heap, these operations don't use any stack space. QED.
精彩评论