开发者

Haskell, Monads, Stack Space, Laziness -- how to structure code to be lazy?

A contrived example, but the below code demonstrates a class of problems I keep running into while learning Haskell.开发者_StackOverflow中文版

import Control.Monad.Error
import Data.Char (isDigit)

countDigitsForList [] = return []
countDigitsForList (x:xs) = do
    q  <- countDigits x
    qs <- countDigitsForList xs
    return (q:qs)

countDigits x = do
    if all isDigit x
        then return $ length x
        else throwError $ "Bad number: " ++ x

t1 = countDigitsForList ["1", "23", "456", "7890"] :: Either String [Int]
t2 = countDigitsForList ["1", "23", "4S6", "7890"] :: Either String [Int]

t1 gives me the right answer and t2 correctly identifies the error.

Seems to me that, for a sufficiently long list, this code is going to run out of stack space because it runs inside of a monad and at each step it tries to process the rest of the list before returning the result.

An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.

How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?


How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?

You can't make this function process the list lazily, monads or no. Here's a direct translation of countDigitsForList to use pattern matching instead of do notation:

countDigitsForList [] = return []
countDigitsForList (x:xs) = case countDigits x of
    Left e  -> Left e
    Right q -> case countDigitsForList xs of
                   Left e   -> Left e
                   Right qs -> Right (q:qs)

It should be easier to see here that, because a Left at any point in the list makes the whole thing return that value, in order to determine the outermost constructor of the result, the entire list must be traversed and processed; likewise for processing each element. Because the final result potentially depends on the last character in the last string, this function as written is inherently strict, much like summing a list of numbers.

Given that, the thing to do is ensure that the function is strict enough to avoid building up a huge unevaluated expression. A good place to start for information on that is discussions on the difference between foldr, foldl and foldl'.

An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.

Both are unnecessary when you can instead generate, process, and consume a list lazily; the simplest example here being map. For a function where that's not possible, strictly-evaluated tail recursion is precisely what you want.


camccann is right that the function is inherently strict. But that doesn't mean that it can't run in constant stack!

countDigitsForList xss = go xss []
    where go (x:xs) acc = case countDigits x of
                             Left e -> Left e
                             Right q -> go xs (q:acc)
          go [] acc = reverse acc

This accumulating parameter version is a partial cps transform of camccann's code, and I bet that you could get the same result by working over a cps-transformed either monad as well.

Edited to take into account jwodder's correction regarding reverse. oops. As John L notes an implicit or explicit difference list would work as well...

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜