Fully featured CSV parser for Haskell?

Can anybody recommend a way to parse CSV files with options to:

  • set cells/fields separator
  • set end of record/row terminator
  • set quote-character for fields
  • support of UTF-8 strings
  • ability to write in-memory CSV structure back to a file

I did try Text.CSV, but it is very simple and lacks most of the above features. Is there a more advanced CSV parsing module, or do I have to write it "from scratch", i.e. using Text.ParserCombinators? I would rather not reinvent the wheel.

Take care.


I can't recommend a ready-to-go, packaged-up CSV parser for Haskell, but I remember that the book Real World Haskell by Bryan O'Sullivan et al. contains a chapter on Parsec, which the authors demonstrate by creating a CSV parser.

The relevant chapter 16: Using Parsec is available online; check the section titled Extended Example: Full CSV Parser.
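To give an idea of what the Parsec route looks like, here is a minimal sketch in the spirit of that chapter, not the book's own code. The `CsvOptions` record and the names `optSeparator`, `optQuote`, `defaultOptions` and `parseCsv` are made up for this illustration; only the combinators come from Parsec. It assumes a quote character inside a quoted field is escaped by doubling it, and it accepts CRLF, LF or CR as the record terminator rather than making that configurable.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical options record for this sketch (not part of Parsec).
data CsvOptions = CsvOptions
  { optSeparator :: Char  -- field separator, e.g. ',' or '\t'
  , optQuote     :: Char  -- quote character, e.g. '"'
  }

defaultOptions :: CsvOptions
defaultOptions = CsvOptions { optSeparator = ',', optQuote = '"' }

-- A file is a sequence of records, each ended by a line terminator.
csvFile :: CsvOptions -> Parser [[String]]
csvFile opts = record opts `endBy` eol <* eof

-- A record is one or more fields separated by the separator character.
record :: CsvOptions -> Parser [String]
record opts = field opts `sepBy1` char (optSeparator opts)

-- A field is either quoted or a (possibly empty) run of ordinary characters.
field :: CsvOptions -> Parser String
field opts =
  quotedField opts <|> many (noneOf [optSeparator opts, '\n', '\r'])

-- Inside quotes, a doubled quote character stands for one literal quote.
quotedField :: CsvOptions -> Parser String
quotedField opts = between (char q) (char q) (many quotedChar)
  where
    q          = optQuote opts
    quotedChar = noneOf [q] <|> try (char q *> char q)

-- Accept CRLF, LF or CR as the record terminator.
eol :: Parser String
eol = try (string "\r\n") <|> string "\n" <|> string "\r"

parseCsv :: CsvOptions -> String -> Either ParseError [[String]]
parseCsv opts = parse (csvFile opts) "(csv)"
```

For example, `parseCsv (CsvOptions '\t' '\'') "x\ty\n"` parses a tab-separated line with single quotes as the quote character. Since the parser works on `String`, the fields are Unicode text; decoding a UTF-8 file into a `String` is left to whatever reads the file.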


This is an old thread, but both csv-conduit and cassava have most, if not all -- not sure about re-writing to the file -- of the features you're looking for.


A quick search on Hackage finds Data.Spreadsheet, which does have customizable quote and separator.


There is the Data.Csv module on Hackage, provided by the cassava package. If your distribution does not provide a package for it, you can install it via cabal, e.g.

$ cabal install cassava

It can read and write (i.e. decode/encode) records from/to CSV files.

You can set the field separator like this:

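import Data.Csv
import Data.Char (ord)
import qualified Data.ByteString.Lazy.Char8 as B

-- Use a tab instead of the default comma as the field delimiter.
enc_opts = defaultEncodeOptions {
  encDelimiter = fromIntegral $ ord '\t'
}

-- Encode the records and print the result to stdout.
write_csv vector = do
  B.putStr $ encodeWith enc_opts vector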

When this answer was written, Data.Csv did not offer other encode/decode options; later cassava releases add further EncodeOptions fields such as encUseCrLf and encQuoting. There are function variants for working with a header row. By default, lines are terminated with CRLF, double quotes are used for quoting, and UTF-8 is assumed as the text encoding. Double quotes inside values are escaped by doubling them (as in RFC 4180), not with a backslash, and quoting is omitted where it is not necessary.
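Decoding takes the delimiter the same way, through `decDelimiter` in `DecodeOptions`. The following is a rough sketch of a read-then-write round trip, assuming a recent cassava release (where `encodeWith` takes a list), a header-less tab-separated file, and rows of the shape (name, age). The names `decodeTsv`, `encodeTsv` and `rewriteTsv` are hypothetical helpers for this illustration.

```haskell
import Data.Char (ord)
import Data.Csv
import Data.Text (Text)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Vector as V

-- Tab-delimited variants of the default options.
tsvDecodeOptions :: DecodeOptions
tsvDecodeOptions =
  defaultDecodeOptions { decDelimiter = fromIntegral (ord '\t') }

tsvEncodeOptions :: EncodeOptions
tsvEncodeOptions =
  defaultEncodeOptions { encDelimiter = fromIntegral (ord '\t') }

-- Parse header-less rows of (name, age); Left carries the error message.
decodeTsv :: BL.ByteString -> Either String (V.Vector (Text, Int))
decodeTsv = decodeWith tsvDecodeOptions NoHeader

-- Serialise the in-memory rows back to tab-separated bytes.
encodeTsv :: [(Text, Int)] -> BL.ByteString
encodeTsv = encodeWith tsvEncodeOptions

-- Read a file, decode it, and write the re-encoded rows to another file.
rewriteTsv :: FilePath -> FilePath -> IO ()
rewriteTsv inPath outPath = do
  contents <- BL.readFile inPath
  case decodeTsv contents of
    Left err   -> putStrLn ("parse error: " ++ err)
    Right rows -> BL.writeFile outPath (encodeTsv (V.toList rows))
```

Keeping `decodeTsv` and `encodeTsv` pure makes them easy to try out in GHCi before any files are involved.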


Cassava works in memory and is a very simple library, e.g.

encode [("John" :: Text, 27), ("Jane", 28)]
"John,27\r\nJane,28\r\n"