Haskell: Delimit a string by chosen sub-strings and whitespace

I am still new to Haskell, so I apologize if there is an obvious answer to this.

I would like to make a function that splits up all of the following lists of strings, i.e. [String]:

["int x = 1", "y := x + 123"]
["int   x=   1", "y:=   x+123"] 
["int x=1", "y:=x+123"] 

All into the same list of lists of strings, i.e. [[String]]:

[["int", "x", "=", "1"], ["y", ":=", "x", "+", "123"]]

You can use map words for the first [String] (or map words . lines if the input is a single multi-line String).

But I do not know any really neat ways to also take into account the others, where you would be using the various sub-strings "=", ":=", "+" etc. to break up the main string.

Thank you for taking the time to enlighten me on Haskell :-)


The Prelude comes with a little-known handy function called lex, which is a lexer for Haskell expressions. These match the form you need.

lex :: String -> [(String,String)]

What a weird type though! The list is there for interfacing with a standard type of parser, but I'm pretty sure lex always returns either 1 or 0 elements (0 indicating a parse failure). The tuple is (token-lexed, rest-of-input), so lex only pulls off one token. So a simple way to lex a whole string would be:

lexStr :: String -> [String]
lexStr s =
    case lex s of
        [("", _)]    -> []                  -- only whitespace (or nothing) is left
        [(tok,rest)] -> tok : lexStr rest   -- keep the token, lex the remainder
        _            -> error "Failed lex"

To appease the pedants: this code is in terrible form. It calls error explicitly instead of returning a reasonable error using Maybe, it assumes lex only returns 1 or 0 elements, and so on. The code that does this reliably is about the same length, but is significantly more abstract, so I spared your beginner eyes.
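For readers who want the Maybe-based variant mentioned above, here is a minimal sketch. The names lexAll and lexLines are made up for illustration and are not from the original answer. It assumes lex yields an empty token once only whitespace remains, and it treats any other result shape as a failure.

```haskell
-- Hypothetical helper: tokenize one line with the Prelude's lex,
-- reporting failure as Nothing instead of calling error.
lexAll :: String -> Maybe [String]
lexAll s =
    case lex s of
        [("", _)]     -> Just []                  -- only whitespace (or nothing) is left
        [(tok, rest)] -> (tok :) <$> lexAll rest  -- keep the token, continue on the rest
        _             -> Nothing                  -- lex could not read a token

-- Hypothetical helper: tokenize every line; one bad line makes the result Nothing.
lexLines :: [String] -> Maybe [[String]]
lexLines = mapM lexAll
```

With this, lexLines ["int x=1", "y:=x+123"] should give Just [["int","x","=","1"],["y",":=","x","+","123"]]. One caveat: lex follows Haskell's lexical rules, so adjacent operator characters are read as a single token, e.g. "x=-1" yields "=-" rather than "=" and "-".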


I would take a look at parsec and build a simple grammar for parsing your strings.
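As a rough idea of what such a grammar could look like, here is a small sketch using the parsec package. The token classes (alphanumeric runs, and runs of the operator characters ":=+-*/<>") are an assumption chosen to fit the examples in the question, and the names tokenP, lineP and tokenizeLine are hypothetical.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One token: a run of letters/digits, or a run of operator characters
-- (the operator set is an assumption). Trailing whitespace is skipped.
tokenP :: Parser String
tokenP = (many1 alphaNum <|> many1 (oneOf ":=+-*/<>")) <* spaces

-- A whole line: optional leading whitespace, then tokens up to end of input.
lineP :: Parser [String]
lineP = spaces *> many tokenP <* eof

-- Run the line parser; Left carries parsec's error message.
tokenizeLine :: String -> Either ParseError [String]
tokenizeLine = parse lineP ""
```

Mapping tokenizeLine over the input list then gives one result per line, and a character outside the assumed token classes shows up as a Left with its position instead of a crash.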


How about using words? words :: String -> [String], and words does not care how much whitespace separates the tokens.

words "Hello World"
= words "Hello     World"
= ["Hello", "World"]
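To see how far words alone gets on the three inputs from the question, here is a small sketch; the names spaced, mixed and compact are just labels for this example. Since words splits on whitespace only, it never separates operators that are glued to their operands.

```haskell
-- The three inputs from the question, each tokenized with map words.
spaced, mixed, compact :: [[String]]
spaced  = map words ["int x = 1", "y := x + 123"]
mixed   = map words ["int   x=   1", "y:=   x+123"]
compact = map words ["int x=1", "y:=x+123"]

-- spaced  == [["int","x","=","1"],["y",":=","x","+","123"]]
-- mixed   == [["int","x=","1"],["y:=","x+123"]]
-- compact == [["int","x=1"],["y:=x+123"]]
```

So words covers the first input, but the other two need something that knows about the operator sub-strings.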