inlined functions still show up in the .prof file

I'm trying to figure out how to optimize some code. Here it is:


{-# OPTIONS_GHC -funbox-strict-fields #-}

data Vec3 a = Vec3  !a !a !a

vx :: Vec3 a -> a
vx (Vec3 x _ _) = x
{-# SPECIALIZE INLINE vx :: Vec3 Double -> Double #-}

vy :: Vec3 a -> a
vy (Vec3 _ y _) = y
{-# SPECIALIZE INLINE vy :: Vec3 Double -> Double #-}

vz :: Vec3 a -> a
vz (Vec3 _ _ z) = z
{-# SPECIALIZE INLINE vz :: Vec3 Double -> Double #-}


dot :: (Num a) => Vec3 a -> Vec3 a -> a
dot u v = (vx u * vx v) + (vy u * vy v) + (vz u * vz v)
{-# SPECIALIZE INLINE dot :: Vec3 Double -> Vec3 Double -> Double #-}


type Vec3D = Vec3 Double

-- just make a bunch of vecs to measure performance

n = 1000000 :: Double

v1s = [Vec3 x y z | (x, y, z) <- zip3 [1 .. n] [2 .. n + 1] [3 .. n + 2]]
      :: [Vec3D]

v2s = [Vec3 x y z | (x, y, z) <- zip3 [3 .. n + 2] [2 .. n + 1] [1 .. n]]
      :: [Vec3D]


dots = zipWith dot v1s v2s  :: [Double]    
theMax = maximum dots :: Double
main :: IO ()
main = putStrLn $ "theMax: " ++ show theMax

When I compile with ghc 6.12.1 (ubuntu linux on an i486 machine)

ghc --make -O2 Vec.hs -prof -auto-all -fforce-recomp

and run

Vec +RTS -p

Looking at the Vec.prof file,


COST CENTRE                    MODULE               %time %alloc

v2s                            Main                  30.9   36.5
v1s                            Main                  27.9   31.3
dots                           Main                  27.2   27.0
CAF                            GHC.Float              4.4    5.2
vy                             Main                   3.7    0.0
vx                             Main                   2.9    0.0
theMax                         Main                   2.2    0.0

I see that the functions vx and vy take a significant portion of the time.

Why is that? I thought that the SPECIALIZE INLINE pragma would make those functions go away.

When using a non-polymorphic

data Vec3D = Vec3D {vx, vy, vz :: !Double} deriving Show

the functions vx, vy, vz do not show up as cost centres.


I suspect this is a side-effect of using -auto-all, which inhibits many optimizations GHC would normally perform, including inlining. I suspect the difference in your non-polymorphic version is actually due to vx, vy, and vz being defined via record syntax rather than because of polymorphism (but I could be wrong about this).

Instead of using -auto-all, try either adding an export list to the module and compiling with "-auto", or manually setting cost centers via SCC pragmas. I usually use SCC pragmas anyway because I often want to set them on let-bound functions, which -auto-all won't do.
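As a rough sketch of the manual approach, the module below has an export list and places SCC pragmas by hand, including one on a where-bound value. The helper maxDot and the cost-centre names are made up for illustration and are not from the question's code. It would be compiled with -prof but without -auto-all; without -prof the SCC pragmas are simply ignored.

```haskell
module Main (main) where

-- Strict polymorphic 3-vector, as in the question.
data Vec3 a = Vec3 !a !a !a

dot :: Num a => Vec3 a -> Vec3 a -> a
dot (Vec3 a b c) (Vec3 x y z) = a * x + b * y + c * z

-- Hypothetical helper: largest dot product of the paired vectors.
-- Assumes both lists are non-empty, since maximum fails on [].
maxDot :: [Vec3 Double] -> [Vec3 Double] -> Double
maxDot us vs = {-# SCC "maxDot_maximum" #-} (maximum products)
  where
    -- A where-bound value with its own hand-placed cost centre.
    products = {-# SCC "maxDot_products" #-} (zipWith dot us vs)

main :: IO ()
main = print (maxDot [Vec3 1 2 3, Vec3 0 0 1] [Vec3 3 2 1, Vec3 5 5 5])
```

With only these two annotations, dot itself carries no cost centre, so its cost should be attributed to whichever annotated expression encloses the call.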


I could not figure out how to make comments to the replies, so I'm making comments in this answer.

First, thanks for your answers.

FUZxxl: I tried -ddump-core, and got an error message that -ddump-core was an unrecognized flag. Perhaps you meant -ddump-simpl, which the book Real World Haskell recommended using, but I'm afraid I don't know how to read the output. I looked in the output file for "vx", etc, but never saw them. I guess I should learn how to read core. Are there any good guides for that?

John: According to GHC's flag reference documentation, if I'm reading it correctly, both -auto and -auto-all are supposed to add _scc_s to functions not marked INLINE. To see if -auto would work for me, I created another test case in which the Vec3 code was in a separate file/module, with Vec3(Vec3), vx, vy, vz, and dot exported. I imported this module into a Main.hs file. Compiling these with -auto, I still saw vx, vy, vz in the .prof file.
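Taking that documentation wording at face value, one experiment worth trying is to mark the accessors with a plain INLINE pragma instead of SPECIALIZE INLINE, since the latter may not count as "marked INLINE" for the automatic annotation. This is only a sketch of that experiment; whether it actually removes vx, vy and vz from the .prof file on GHC 6.12.1 is an assumption that has not been checked here.

```haskell
module Main (main) where

-- Strict polymorphic 3-vector, as in the question.
data Vec3 a = Vec3 !a !a !a

-- Plain INLINE on each accessor, replacing SPECIALIZE INLINE.
{-# INLINE vx #-}
vx :: Vec3 a -> a
vx (Vec3 x _ _) = x

{-# INLINE vy #-}
vy :: Vec3 a -> a
vy (Vec3 _ y _) = y

{-# INLINE vz #-}
vz :: Vec3 a -> a
vz (Vec3 _ _ z) = z

{-# INLINE dot #-}
dot :: Num a => Vec3 a -> Vec3 a -> a
dot u v = (vx u * vx v) + (vy u * vy v) + (vz u * vz v)

main :: IO ()
main = print (dot (Vec3 1 2 3) (Vec3 3 2 1 :: Vec3 Double))
```

If the cost centres disappear with this change, that would point at the pragma choice rather than polymorphism as the cause.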

Re: your comment that the difference could be due to record syntax instead of polymorphism, I believe that the difference is more likely due to polymorphism, because when I defined

data Vec3 a = Vec3 {vx, vy, vz :: !a}

vx, vy and vz still showed up in the .prof file.

Tad
