Regex & String Libraries in Haskell

2023-01-29 17:29 问答作者：

I'm trying to introduce Haskell into my daily life by using it to write incidental scripts and such.

readProcess is 开发者_开发技巧handy for getting the results of exterior commands, but I find myself searching when it comes to processing the String results. I'm coming from ruby where regexes are first-class, so I'm used to having them as a tool.

Any libraries I should read up on to do string processing in haskell? Searching for matching lines, pulling out matching regions of a string, and such?

I found this to be a good starting point: http://www.serpentine.com/blog/2007/02/27/a-haskell-regular-expression-tutorial/ It only covers the basics, no advanced topics, but it's great to get started IMHO.

Things to note:

Regexes in haskell are different in that they have overloaded return types. This means that you can pull many different kinds of thing out of a regex match. (Bool, String, [String], etc...) Depending on the return type you use, it will give you back a different kind of answer (whether or not the regex matched, the test of the match, all matching subgroups, etc..) This is done using some fairly complex typeclass voodoo. The above link demonstrates the basic kinds, a more complete list is here
There are actually multiple standard modules in haskell that provide regex support (strange but true). The tutorial above shows the POSIX module, because it comes standard in haskell. If you have cabal, you can also pretty easily install other regex modules and use those instead. There's a pcre binding (regex-pcre), as well as some packages that work via DFAs (regex-dfa, among others). Install using a command like: cabal install regex-pcre and you should be good to go.
- (The modules have a standardized interface, the difference is mainly in the implementation and the regex flavor)
There IS a regex object in haskell, but you don't really need it to use the =~ or =~~ match operators. (Just use a string, conversion happens automatically). If your task is complicated enough that you want a first class parsing object, consider looking into Parsec as has been mentioned in other answers.

DISCLAIMER: I only really user pcre, myself, so I don't really know much about the other packages.

When I was first teaching myself Haskell I found that learning to use a parser combinator library for string processing was a fantastic investment. They can do everything regular expressions can do, and much more, and writing combinator parsers is a great way to build up intuitions about type classes like monads, applicative functors, etc.

I tend to use Attoparsec these days, but Parsec is probably a better starting point because it's more widely documented and discussed, provides nicer error messages, etc.

A good introduction to regular expressions is to be found in Realworld Haskell

Update: On a side note, for command-processing and pipes and such, checkout HSH.

There are plenty of great regex libs in Haskell, but we have better tools. Let's stick with standard Haskell Strings for now (i.e. lists of Char). The basics are all in Data.List -- http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.3.0.0/Data-List.html. You have lines, unlines, words, unwords, takewhile, dropwhile, etc.etc. Also isPrefixOf and isInfixOf, etc.

You may end up writing your own recursive functions fairly directly, but that's a breeze too. The only really missing operations are splitting ones, for which you can use brent's excellent package: http://hackage.haskell.org/package/split

Fundamentally, the notion is that you want to do incremental processing of streams of characters.

Not everything is as efficient as possible, especially since the string representation is not that efficient. But if/when you move on to other data types, the core concepts of how you process things will translate directly from basic strings.

继续阅读：haskell regex string

Regex & String Libraries in Haskell

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？