Dealing with IO vs pure code in Haskell
I'm writing a shell script (my first non-example program in Haskell) that is supposed to list a directory, get every file's size, do some string manipulation (pure code), and then rename some files. I'm not sure what I'm doing wrong, so two questions:
- How should I arrange the code in such a program?
- I have a specific issue: I get the following error. What am I doing wrong?
error:
Couldn't match expected type `[FilePath]'
against inferred type `IO [FilePath]'
In the second argument of `mapM', namely `fileNames'
In a stmt of a 'do' expression:
files <- (mapM getFileNameAndSize fileNames)
In the expression:
do { fileNames <- getDirectoryContents;
files <- (mapM getFileNameAndSize fileNames);
sortBy cmpFilesBySize files }
code:
getFileNameAndSize fname = do (fname, (withFile fname ReadMode hFileSize))
getFilesWithSizes = do
  fileNames <- getDirectoryContents
  files <- (mapM getFileNameAndSize fileNames)
  sortBy cmpFilesBySize files
Your second, specific problem is with the types of your functions. However, your first issue (not really a type thing) is the do statement in getFileNameAndSize. While do is used with monads, it's not a monadic panacea; it's actually implemented as some simple translation rules. The Cliff's Notes version (which isn't exactly right, thanks to some details involving error handling, but is close enough) is:
do a                    ≡  a
do a ; b ; c ...        ≡  a >> do b ; c ...
do x <- a ; b ; c ...   ≡  a >>= \x -> do b ; c ...
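To make the rules concrete, here is a minimal sketch that writes the same action twice, once with do and once desugared by hand. The names greetSugar, greetDesugared, and pair are invented for illustration, and return "world" stands in for a real input action.

```haskell
-- Written with do-notation.
greetSugar :: IO String
greetSugar = do
  putStrLn "Name?"
  name <- return "world"   -- stand-in for an input action such as getLine
  return ("hello, " ++ name)

-- The same action after applying the translation rules by hand.
greetDesugared :: IO String
greetDesugared =
  putStrLn "Name?" >>
  (return "world" >>= \name ->
    return ("hello, " ++ name))

-- First rule: a do block holding a single expression is just that
-- expression, so this is an ordinary tuple and no monad is involved.
pair :: (Int, String)
pair = do (1, "one")
```

The last definition is the case that matters here: a do wrapped around a single expression changes nothing.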
In other words, getFileNameAndSize is equivalent to the version without the do block, and so you can get rid of the do. This leaves you with
getFileNameAndSize fname = (fname, withFile fname ReadMode hFileSize)
We can find the type for this: since fname is the first argument to withFile, it has type FilePath; and hFileSize returns an IO Integer, so that's the type of withFile ... . Thus, we have getFileNameAndSize :: FilePath -> (FilePath, IO Integer). This may or may not be what you want; you might instead want FilePath -> IO (FilePath,Integer). To change it, you can write any of
getFileNameAndSize_do fname = do
  size <- withFile fname ReadMode hFileSize
  return (fname, size)
getFileNameAndSize_fmap fname = fmap ((,) fname) $
                                  withFile fname ReadMode hFileSize
-- With `import Control.Applicative ((<$>))`; <$> is a synonym for fmap.
getFileNameAndSize_fmap2 fname = ((,) fname)
                                   <$> withFile fname ReadMode hFileSize
-- With {-# LANGUAGE TupleSections #-} at the top of the file
getFileNameAndSize_ts fname = (fname,) <$> withFile fname ReadMode hFileSize
Next, as KennyTM pointed out, you have fileNames <- getDirectoryContents; since getDirectoryContents has type FilePath -> IO [FilePath], you need to give it an argument (e.g. getFilesWithSizes dir = do fileNames <- getDirectoryContents dir ...). This is probably just a simple oversight.
Next, we come to the heart of your error: files <- (mapM getFileNameAndSize fileNames). I'm not sure why it gives you the precise error it does, but I can tell you what's wrong. Remember what we know about getFileNameAndSize. In your code, it returns a (FilePath, IO Integer). However, mapM is of type Monad m => (a -> m b) -> [a] -> m [b], and so mapM getFileNameAndSize is ill-typed. You want getFileNameAndSize :: FilePath -> IO (FilePath,Integer), like I implemented above.
Finally, we need to fix your last line. First of all, although you don't give it to us, cmpFilesBySize is presumably a function of type (FilePath, Integer) -> (FilePath, Integer) -> Ordering, comparing on the second element. This is really simple, though: using Data.Ord.comparing :: Ord a => (b -> a) -> b -> b -> Ordering, you can write this as comparing snd, which has type Ord b => (a, b) -> (a, b) -> Ordering. Second, you need to return your result wrapped up in the IO monad rather than just as a plain list; the function return :: Monad m => a -> m a will do the trick.
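As a small illustration of those two points, the sketch below sorts a hand-written list of pairs with comparing snd and then wraps the pure result with return. The names sampleFiles, sortBySize, and sortedInIO, and the sample data, are made up for the example.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Hypothetical sample data: (file name, size in bytes).
sampleFiles :: [(FilePath, Integer)]
sampleFiles = [("b.txt", 300), ("a.txt", 20), ("c.txt", 150)]

-- Pure: sort by the second component of each pair (the size).
sortBySize :: [(FilePath, Integer)] -> [(FilePath, Integer)]
sortBySize = sortBy (comparing snd)

-- return lifts the already-sorted pure list into IO.
sortedInIO :: IO [(FilePath, Integer)]
sortedInIO = return (sortBySize sampleFiles)
```

Here sortBySize sampleFiles evaluates to [("a.txt",20),("c.txt",150),("b.txt",300)].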
Thus, putting this all together, you'll get
import System.IO (FilePath, withFile, IOMode(ReadMode), hFileSize)
import System.Directory (getDirectoryContents)
import Control.Applicative ((<$>))
import Data.List (sortBy)
import Data.Ord (comparing)
getFileNameAndSize :: FilePath -> IO (FilePath, Integer)
getFileNameAndSize fname = ((,) fname) <$> withFile fname ReadMode hFileSize
getFilesWithSizes :: FilePath -> IO [(FilePath,Integer)]
getFilesWithSizes dir = do
  fileNames <- getDirectoryContents dir
  files <- mapM getFileNameAndSize fileNames
  return $ sortBy (comparing snd) files
This is all well and good, and will work fine. However, I might write it slightly differently. My version would probably look like this:
{-# LANGUAGE TupleSections #-}
import System.IO (FilePath, withFile, IOMode(ReadMode), hFileSize)
import System.Directory (getDirectoryContents)
import Control.Applicative ((<$>))
import Control.Monad ((<=<))
import Data.List (sortBy)
import Data.Ord (comparing)
preservingF :: Functor f => (a -> f b) -> a -> f (a,b)
preservingF f x = (x,) <$> f x
-- Or liftM2 (<$>) (,), but I am not entirely sure why.
fileSize :: FilePath -> IO Integer
fileSize fname = withFile fname ReadMode hFileSize
getFilesWithSizes :: FilePath -> IO [(FilePath,Integer)]
getFilesWithSizes = return . sortBy (comparing snd)
                      <=< mapM (preservingF fileSize)
                      <=< getDirectoryContents
(<=< is the monadic equivalent of ., the function composition operator.) First off: yes, my version is longer. However, I'd probably already have preservingF defined somewhere, making the two equivalent in length.* (I might even inline fileSize if it weren't used elsewhere.) Second, I like this version better because it involves chaining together simpler pure functions we've already written. While your version is similar, mine (I feel) is more streamlined and makes this aspect of things clearer.
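If <=< is unfamiliar, a tiny sketch in the Maybe monad may help; it composes two functions that can each fail. The helpers safeRecip, safeSqrt, and recipThenSqrt are hypothetical and not part of the code above.

```haskell
import Control.Monad ((<=<))

-- Hypothetical helper: reciprocal that fails on zero.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

-- Hypothetical helper: square root that fails on negative input.
safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Reads right to left, like (.): first safeRecip, then safeSqrt.
recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = safeSqrt <=< safeRecip
```

For instance, recipThenSqrt 4 is Just 0.5, while recipThenSqrt 0 is Nothing because the first step fails.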
So this is a bit of an answer to your first question of how to structure these things. I personally tend to lock my IO down into as few functions as possible: only functions which need to touch the outside world directly (e.g. main and anything which interacts with a file) get an IO. Everything else is an ordinary pure function (and is only monadic if it's monadic for general reasons, along the lines of preservingF). I then arrange things so that main, etc., are just compositions and chains of pure functions: main gets some values from IO-land; then it calls pure functions to fold, spindle, and mutilate the data; then it gets more IO values; then it operates more; etc. The idea is to separate the two domains as much as possible, so that the more compositional non-IO code is always free, and the black-box IO is only done precisely where necessary.
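Applied to a renaming script like the one in the question, that separation might look roughly like the sketch below. Everything in it is an assumption made for illustration: the names newName, planRenames, and applyRenames are invented, and lower-casing is only a placeholder for whatever string manipulation you actually need.

```haskell
import Data.Char (toLower)
import Data.List (sortBy)
import Data.Ord (comparing)

-- Pure: compute the new name for a file (placeholder rule: lower-case it).
newName :: FilePath -> FilePath
newName = map toLower

-- Pure: plan the renames, smallest file first, skipping names that
-- would not change.
planRenames :: [(FilePath, Integer)] -> [(FilePath, FilePath)]
planRenames files =
  [ (old, new)
  | (old, _) <- sortBy (comparing snd) files
  , let new = newName old
  , new /= old
  ]

-- IO: carry out a plan. The rename action is a parameter so the planning
-- logic stays testable; a real script could pass
-- System.Directory.renameFile here.
applyRenames :: (FilePath -> FilePath -> IO ()) -> [(FilePath, FilePath)] -> IO ()
applyRenames rename = mapM_ (uncurry rename)
```

Only applyRenames lives in IO; the decisions about which files to rename, and to what, are made by pure functions that can be tried out in ghci without touching the disk.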
Operators like <=< really help with writing code in this style, as they let you operate on functions which interact with monadic values (such as the IO-world) just as you would operate on normal functions. You should also look at Control.Applicative's function <$> liftedArg1 <*> liftedArg2 <*> ... notation, which lets you apply ordinary functions to any number of monadic (really Applicative) arguments. This is really nice for getting rid of spurious <-s and just chaining pure functions over monadic code.
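A minimal sketch of that notation, with made-up names: describe is an ordinary two-argument function, and fetchName and fetchSize are stand-ins for real IO actions. The two definitions at the end should produce the same result.

```haskell
import Control.Applicative ((<$>), (<*>))

-- An ordinary pure function of two arguments.
describe :: String -> Int -> String
describe name size = name ++ ": " ++ show size ++ " bytes"

-- Stand-ins for real IO actions (hypothetical).
fetchName :: IO String
fetchName = return "notes.txt"

fetchSize :: IO Int
fetchSize = return 42

-- With explicit binds.
describeDo :: IO String
describeDo = do
  name <- fetchName
  size <- fetchSize
  return (describe name size)

-- Applicative style: no intermediate names needed.
describeAp :: IO String
describeAp = describe <$> fetchName <*> fetchSize
```

On recent GHC versions both operators are already exported by the Prelude, so the import is only needed on older compilers.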
*: I feel like preservingF, or at least its sibling preserving :: (a -> b) -> a -> (a,b), should be in a package somewhere, but I've been unable to find either.
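For reference, the pure sibling is short enough to define locally. The sketch below gives one possible definition, plus an equivalent spelling using (&&&) from Control.Arrow; the name preserving' is invented here.

```haskell
import Control.Arrow ((&&&))

-- Pair each input with the result of applying f to it.
preserving :: (a -> b) -> a -> (a, b)
preserving f x = (x, f x)

-- The same function written with (&&&), which fans one input out to
-- two functions and pairs the results.
preserving' :: (a -> b) -> a -> (a, b)
preserving' f = id &&& f
```

For example, preserving length "abc" gives ("abc",3).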
getDirectoryContents is a function. You should supply an argument to it, e.g.
fileNames <- getDirectoryContents "/usr/bin"
Also, the type of getFileNameAndSize is FilePath -> (FilePath, IO Integer), as you can check from ghci:
Prelude> :m + System.IO
Prelude System.IO> let getFileNameAndSize fname = do (fname, (withFile fname ReadMode hFileSize))
Prelude System.IO> :t getFileNameAndSize
getFileNameAndSize :: FilePath -> (FilePath, IO Integer)
But mapM requires the input function to return a value of type IO stuff:
Prelude System.IO> :t mapM
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
-- # ^^^^^^^^
You should change its type to FilePath -> IO (FilePath, Integer) to match the type.
getFileNameAndSize fname = do
  fsize <- withFile fname ReadMode hFileSize
  return (fname, fsize)