
Haskell http response result unreadable

import Network.URI
import Network.HTTP
import Network.Browser

get :: URI -> IO String
get uri = do
  let req = Request uri GET [] ""
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
        body <- get uri
        writeFile "output.txt" body

Here is the diff between the Haskell output and the curl output:

[image: diff of the Haskell output against the curl output]


It's probably not a good idea to use String as the intermediate data type here, because it causes character conversions both when reading the HTTP response and when writing to the file. If these two conversions are not consistent with each other, as appears to be the case here, the output gets corrupted.
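To make the corruption concrete, here is a minimal sketch of one way such a mismatch can arise. It assumes the String body holds one Char per raw byte, and that the file is then written through a handle that encodes as UTF-8 (as writeFile would under a UTF-8 locale). The file name roundtrip.tmp is only for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import System.IO

main :: IO ()
main = do
  -- The UTF-8 encoding of "é" is two bytes: 0xC3 0xA9.
  let original = B.pack [0xC3, 0xA9]
      -- Assumption: the String response contains one Char per raw byte.
      asString = BC.unpack original
  -- Write the String through a text handle that encodes as UTF-8.
  withFile "roundtrip.tmp" WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h asString
  written <- B.readFile "roundtrip.tmp"
  putStrLn $ "original bytes: " ++ show (B.unpack original)  -- [195,169]
  putStrLn $ "written bytes:  " ++ show (B.unpack written)   -- [195,131,194,169]
```

Each byte above 0x7F is treated as a separate character and encoded again, so the two original bytes become four and the text no longer decodes to the original characters.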

Since you just want to copy the bytes directly, it's better to use a ByteString. I've chosen to use a lazy ByteString here, so that it does not have to be loaded into memory all at once, but can be streamed lazily into the file, just like with String.

import Network.URI
import Network.HTTP
import Network.Browser
import qualified Data.ByteString.Lazy as L

get :: URI -> IO L.ByteString
get uri = do
  let req = Request uri GET [] L.empty
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
        body <- get uri
        L.writeFile "output.txt" body

Fortunately, the functions in Network.Browser are overloaded so that the change to lazy bytestrings only involves changing the request body to L.empty, replacing writeFile with L.writeFile, as well as changing the type signature of the function.
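As a sketch of what that overloading allows, the function can also be left polymorphic in the body type so that the caller picks String or ByteString. This version assumes mkRequest from Network.HTTP.Base (re-exported by Network.HTTP) is an acceptable replacement for building the Request by hand; unlike the original, it adds a Content-Length: 0 header to the request.

```haskell
import Network.URI
import Network.HTTP
import Network.Browser
import Network.TCP (HStream)
import qualified Data.ByteString.Lazy as L

-- The body type is chosen by the caller; mkRequest supplies an
-- empty body of the matching type.
get :: HStream ty => URI -> IO ty
get uri = do
  (_, resp) <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request (mkRequest GET uri)
  return (rspBody resp)

main :: IO ()
main =
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
      -- The annotation selects the lazy ByteString instance.
      body <- get uri :: IO L.ByteString
      L.writeFile "output.txt" body
```

Here the type annotation at the call site is the only place that mentions the body type, so switching back to String would need no change to get itself.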
