开发者

A way to measure performance

Given Exercise 14 from 99 Haskell Problems:

(*) Duplicate the elements of a list.

Eg.:

*Main> dupli''' [1..10]
[1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10]

I've implemented 4 solutions:

{-- my first attempt --}
dupli :: [a] -> [a]
dupli [] = []
dupli (x:xs) = replicate 2 x ++ dupli xs

{-- usin开发者_如何转开发g concatMap and replicate --}
dupli' :: [a] -> [a]
dupli' xs = concatMap (replicate 2) xs

{-- usign foldl --}
dupli'' :: [a] -> [a]
dupli'' xs = foldl (\acc x -> acc ++ [x,x]) [] xs

{-- using foldl 2 --}
dupli''' :: [a] -> [a]
dupli''' xs = reverse $ foldl (\acc x -> x:x:acc) [] xs

Still, I don't know how to really measure performance .

So what's the recommended function (from the above list) in terms of performance .

Any suggestions ?


These all seem more complicated (and/or less efficient) than they need to be. Why not just this:

dupli [] = []
dupli (x:xs) = x:x:(dupli xs)

Your last example is close to a good fold-based implementation, but you should use foldr, which will obviate the need to reverse the result:

dupli = foldr (\x xs -> x:x:xs) []

As for measuring performance, the "empirical approach" is profiling. As Haskell programs grow in size, they can get fairly hard to reason about in terms of runtime and space complexity, and profiling is your best bet. Also, a crude but often effective empirical approach when gauging the relative complexity of two functions is to simply compare how long they each take on some sufficiently large input; e.g. time how long length $ dupli [1..1000000] takes and compare it to dupli'', etc.

But for a program this small it shouldn't be too hard to figure out the runtime complexity of the algorithm based on your knowledge of the data structure(s) in question--in this case, lists. Here's a tip: any time you use concatenation (x ++ y), the runtime complexity is O(length x). If concatenation is used inside of a recursive algorithm operating on all the elements of a list of size n, you will essentially have an O(n ^2) algorithm. Both examples I gave, and your last example, are O(n), because the only operation used inside the recursive definition is (:), which is O(1).


As recommended you can use the criterion package. A good description is http://www.serpentine.com/blog/2009/09/29/criterion-a-new-benchmarking-library-for-haskell/.

To summarize it here and adapt it to your question, here are the steps. Install criterion with

cabal install criterion -fchart

And then add the following to your code

import Criterion.Main

l = [(1::Int)..1000]

main = defaultMain [ bench "1" $ nf dupli l
                   , bench "2" $ nf dupli' l 
                   , bench "3" $ nf dupli'' l
                   , bench "4" $ nf dupli''' l
                   ]

You need the nf in order to force the evaluation of the whole result list. Otherwise you'll get just the thunk for the computation.

After that compile and run

ghc -O --make dupli.hs
./dupli -t png -k png

and you get pretty graphs of the running times of the different functions.

It turns out that dupli''' is the fastest from your functions but the foldr version that pelotom listed beats everything.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜