Reasoning laziness

I have the following snippet:

import qualified Data.Vector as V
import qualified Data.ByteString.Lazy as BL
import System.Environment
import Data.Word
import qualified Data.List.Stream as S

histogram :: [Word8] -> V.Vector Int
histogram c = V.accum (+) (V.replicate 256 0) $ S.zip (map fromIntegral c) (S.repeat 1)

mkHistogram file = do
  hist <- (histogram . BL.unpack) `fmap` BL.readFile file
  print hist

I see it like this: nothing is done until printing. When printing, the thunks are unwound by first unpacking, then mapping fromIntegral over one Word8 at a time. Each of these Word8s is zipped with 1, again one value at a time. These tuples are then consumed by the accumulator function, which updates the array one tuple/Word8 at a time. Then we move to the next thunk and repeat until no content is left.

This would allow the histogram to be built in constant memory, but alas that is not what happens: instead it crashes with a stack overflow. If I profile it, I see it run to the end, but it uses a lot of memory (300-500 MB for a 2.5 MB file). Memory is acquired linearly until the end, when it can all be released, forming a "nice" triangular graph.

Where did my reasoning go wrong and what steps should I take to make this run in constant memory?


I believe the problem is that Data.Vector is not strict in its elements. So although your reasoning is right, while accumulating the histogram your thunks look like:

<1+(1+(1+0)) (1+(1+0)) 0 0 (1+(1+(1+(1+0)))) ... >

Rather than

<3 2 0 0 4 ...>

And only when you print are those sums computed. I don't see a strict accum function in the docs (a shame), and there isn't any obvious place to hook in a seq. One way out of this predicament may be to use Data.Vector.Unboxed instead, since unboxed types are unlifted, i.e. strict. Maybe you could request a strict accum function, with your example as a use case.
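To make the idea concrete, here is a minimal sketch of the unboxed approach, using the vector package's Data.Vector.Unboxed and its mutable counterpart. Since unboxed elements cannot hold thunks, each increment is forced as it is written, so no chain of (+) suspensions can build up:

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM
import Control.Monad (forM_)
import Data.Word (Word8)

-- Build the histogram in a mutable unboxed vector. Writes to an
-- unboxed vector force the value, so each count stays fully evaluated.
histogram :: BL.ByteString -> V.Vector Int
histogram bs = V.create $ do
  counts <- VM.replicate 256 0
  -- BL.unpack streams the bytes lazily; each one is consumed and
  -- its bucket bumped before the next is demanded.
  forM_ (BL.unpack bs) $ \w ->
    VM.modify counts (+ 1) (fromIntegral (w :: Word8))
  return counts

main :: IO ()
main = print (histogram (BL.pack [0, 0, 1, 255]))
```

This is only a sketch of the strictness fix, not your exact pipeline: it drops the stream-fusion zip entirely and does the accumulation imperatively in ST via V.create, which is the usual way to get a constant-memory fold with vector when no strict accum is available.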
