Can compiler optimizations, like ghc -O2, change the order (time or storage) of a program?

2023-04-10 12:16 问答作者：

I've got the feeling that the answer is yes, and that's not restricted to Haskell. For example, tail-call optimization changes memory requirements from O(n) to O(l), right?

My precis开发者_C百科e concern is: in the Haskell context, what is expected to understand about compiler optimizations when reasoning about performance and size of a program?

In Scheme, you can take some optimizations for granted, like TCO, given that you are using an interpreter/compiler that conforms to the specification.

Yes, in particular GHC performs strictness analysis, which can drastically reduce space usage of a program with unintended laziness from O(n) to O(1).

For example, consider this trivial program:

$ cat LazySum.hs
main = print $ sum [1..100000]

Since sum does not assume that the addition operator is strict, (it might be used with a Num instance for which (+) is lazy), it will cause a large number of thunks to be allocated. Without optimizations enabled, strictness analysis is not performed.

$ ghc --make LazySum.hs -rtsopts -fforce-recomp
[1 of 1] Compiling Main             ( LazySum.hs, LazySum.o )
Linking LazySum ...
$ ./LazySum +RTS -s
./LazySum +RTS -s 
5000050000
      22,047,576 bytes allocated in the heap
      18,365,440 bytes copied during GC
       6,348,584 bytes maximum residency (4 sample(s))
       3,133,528 bytes maximum slop
              15 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:    23 collections,     0 parallel,  0.04s,  0.03s elapsed
  Generation 1:     4 collections,     0 parallel,  0.01s,  0.02s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.01s  (  0.03s elapsed)
  GC    time    0.05s  (  0.04s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.06s  (  0.07s elapsed)

  %GC time      83.3%  (58.0% elapsed)

  Alloc rate    2,204,757,600 bytes per MUT second

  Productivity  16.7% of total user, 13.7% of total elapsed

However, if we compile with optimizations enabled, the strictness analyzer will determine that since we're using the addition operator for Integer, which is known to be strict, the compiler knows that it is safe to evaluate the thunks ahead of time, and so the program runs in constant space.

$ ghc --make -O2 LazySum.hs -rtsopts -fforce-recomp
[1 of 1] Compiling Main             ( LazySum.hs, LazySum.o )
Linking LazySum ...
$ ./LazySum +RTS -s
./LazySum +RTS -s 
5000050000
       9,702,512 bytes allocated in the heap
           8,112 bytes copied during GC
          27,792 bytes maximum residency (1 sample(s))
          20,320 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:    18 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.01s  (  0.02s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.01s  (  0.02s elapsed)

  %GC time       0.0%  (2.9% elapsed)

  Alloc rate    970,251,200 bytes per MUT second

  Productivity 100.0% of total user, 55.0% of total elapsed

Note that we can get constant space even without optimizations, if we add the strictness ourselves:

$ cat StrictSum.hs 
import Data.List (foldl')
main = print $ foldl' (+) 0 [1..100000]
$ ghc --make StrictSum.hs -rtsopts -fforce-recomp
[1 of 1] Compiling Main             ( StrictSum.hs, StrictSum.o )
Linking StrictSum ...
$ ./StrictSum +RTS -s
./StrictSum +RTS -s 
5000050000
       9,702,664 bytes allocated in the heap
           8,144 bytes copied during GC
          27,808 bytes maximum residency (1 sample(s))
          20,304 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:    18 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.00s  (  0.01s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.00s  (  0.01s elapsed)

  %GC time       0.0%  (2.1% elapsed)

  Alloc rate    9,702,664,000,000 bytes per MUT second

  Productivity 100.0% of total user, 0.0% of total elapsed

Strictness tends to be a bigger issue than tail calls, which aren't really a useful concept in Haskell, because of Haskell's evaluation model.

继续阅读：compiler-optimization ghc haskell

Can compiler optimizations, like ghc -O2, change the order (time or storage) of a program?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？