Haskell Curl Help
Ok, I'm trying to wrap my head around IO in Haskell, and I figured I'd write a short little app dealing with web pages to do it. The snippet I'm getting tripped up at is (with apologies to bobince, though to be fair, I'm not trying to parse HTML here, just extract one or two values):
titleFromUrl url = do
    (_, page) <- curlGetString url [CurlTimeout 60]
    matchRegex (mkRegexWithOpts "<title>(.*?)</title>" False True) page
The above should take a URL in string form, scan the page it points to with matchRegex, and return either Nothing or Just [a], where a is the matched (possibly multi-line) string. The frustrating thing is that when I try doing
Prelude> (_, page) <- curlGetString url [CurlTimeout 60]
Prelude> matchRegex (mkRegexWithOpts "<title>(.*?)</title>" False True) page
in the interpreter, it does precisely what I want it to. When I try to load the same expression, and associated imports, from a file, it gives me a type inference error stating that it couldn't match expected type 'IO b' against inferred type 'Maybe [String]'. This tells me I'm missing something small and fundamental, but I can't figure out what. I've tried explicitly casting page to a string, but that's just programming by superstition (and it didn't work in any case).
Any hints?
Yeah, GHCi accepts any sort of value. You can say:
ghci> 4
4
ghci> print 4
4
But those two values (4 and print 4) are clearly not equal. The magic GHCi is doing is this: if what you typed evaluates to an IO something, it executes that action (and prints the result if something is not ()). If it doesn't, it calls show on the value and prints that. Anyway, this magic is not accessible from your program.
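To make the difference concrete, here is a minimal sketch using only the Prelude. The names pureValue, action and showLikeGhci are made up for illustration, and showLikeGhci is only a rough approximation of what the prompt does with a non-IO expression, not GHCi's actual implementation.

```haskell
-- A plain value: nothing happens when this is defined or evaluated.
pureValue :: Int
pureValue = 4

-- An IO action: a description of an effect, run only when it is executed.
action :: IO ()
action = print 4

-- Roughly what the prompt does for a non-IO expression (an approximation):
-- call show on the value and print the resulting string.
showLikeGhci :: Int -> IO ()
showLikeGhci x = putStrLn (show x)
```

In a compiled program you have to do this step yourself: a value of type Int cannot sit where an IO action is expected, and the compiler will not insert the print for you.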
When you say:
do foo <- bar :: IO Int
   baz
baz is expected to be of type IO something, and it's a type error otherwise. If it were not, you could execute I/O and then return a pure value. You can check this by noting that desugaring the above yields:
bar >>= (\foo -> baz)
And
-- (specializing to IO for simplicity)
(>>=) :: IO a -> (a -> IO b) -> IO b
Therefore
bar :: IO a
foo :: a
baz :: IO b
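As a small sketch of that desugaring, the two definitions below should behave identically. Here bar and baz are hypothetical stand-ins, and baz is written as a function of foo (an illustrative choice) so that the bound variable is visibly used.

```haskell
-- Hypothetical stand-in for bar: an IO action producing an Int.
bar :: IO Int
bar = return 41

-- Hypothetical stand-in for baz: it must produce an IO value, not a bare Int.
baz :: Int -> IO Int
baz n = return (n + 1)

-- Written with do-notation.
withDo :: IO Int
withDo = do
  foo <- bar
  baz foo

-- The same computation written with (>>=) directly.
withBind :: IO Int
withBind = bar >>= (\foo -> baz foo)
```

If baz returned a plain Int instead, the lambda would have type Int -> Int, which does not fit the second argument of (>>=), and that mismatch is the error reported for the original code.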
The way to fix it is to turn your return value into an IO value using the return function:
return :: a -> IO a -- (again specialized to IO)
Your code is then:
titleFromUrl url = do
    (_, page) <- curlGetString url [CurlTimeout 60]
    return $ matchRegex (mkRegexWithOpts "<title>(.*?)</title>" False True) page
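The same shape can be tried without the curl or regex packages. The sketch below is not the code from the question: extractTitle, dropThrough, takeUntil and titleFrom are hypothetical names, the title is found with a naive substring search instead of matchRegex, and the fetch argument stands in for curlGetString. It only shows the pattern of doing the pure work in a pure function and wrapping the result with return.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical pure helper: the text between the first "<title>" and the
-- next "</title>", or Nothing if either tag is missing.
extractTitle :: String -> Maybe String
extractTitle s = dropThrough "<title>" s >>= takeUntil "</title>"

-- Drop everything up to and including the first occurrence of the needle.
dropThrough :: String -> String -> Maybe String
dropThrough _ [] = Nothing
dropThrough needle str@(_ : rest)
  | needle `isPrefixOf` str = Just (drop (length needle) str)
  | otherwise = dropThrough needle rest

-- Keep everything before the first occurrence of the needle.
takeUntil :: String -> String -> Maybe String
takeUntil _ [] = Nothing
takeUntil needle str@(c : rest)
  | needle `isPrefixOf` str = Just []
  | otherwise = fmap (c :) (takeUntil needle rest)

-- fetch is a stand-in for the real download; any IO String will do.
titleFrom :: IO String -> IO (Maybe String)
titleFrom fetch = do
  page <- fetch
  return (extractTitle page)  -- return lifts the pure Maybe into IO
```

Keeping extractTitle pure means it can be tried at the prompt on a literal string, with no network access involved.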
For most of the discussion above, you can substitute any monad for IO (e.g. Maybe, [], ...) and it will still be true.
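For instance, here is a short sketch of the same do/return shape in the Maybe and list monads, using only the Prelude. The names safeDiv, halfOfQuotient and pairs are made up for illustration.

```haskell
-- Division that fails with Nothing instead of crashing on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- In the Maybe monad, the last line must be a Maybe value,
-- so the pure result is wrapped with return (here, Just).
halfOfQuotient :: Int -> Int -> Maybe Int
halfOfQuotient x y = do
  q <- safeDiv x y
  return (q `div` 2)

-- In the list monad, return builds a one-element list,
-- and the do block ranges over every combination.
pairs :: [(Int, Char)]
pairs = do
  n <- [1, 2]
  c <- "ab"
  return (n, c)
```

Leaving out return in either function gives the same kind of type error as in the IO case, with Maybe or [] in place of IO in the message.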