Haskell: Delimit a string by chosen sub-strings and whitespace
I am still new to Haskell, so I apologize if there is an obvious answer to this...
I would like to make a function that splits up all of the following lists of strings, i.e. [String]:
["int x = 1", "y := x + 123"]
["int x= 1", "y:= x+123"]
["int x=1", "y:=x+123"]
all into the same list of lists of strings, i.e. [[String]]:
[["int", "x", "=", "1"], ["y", ":=", "x", "+", "123"]]
You can use map words . lines for the first [String].
But I do not know any really neat ways to also take into account the others - where you would be using the various sub-strings "=", ":=", "+" etc. to break up the main string.
Thank you for taking the time to enlighten me on Haskell :-)
The Prelude comes with a little-known handy function called lex, which is a lexer for Haskell expressions. Your strings match that form.
lex :: String -> [(String,String)]
What a weird type though! The list is there for interfacing with a standard type of parser, but I'm pretty sure lex always returns either 1 or 0 elements (0 indicating a parse failure). The tuple is (token-lexed, rest-of-input), so lex only pulls off one token. So a simple way to lex a whole string would be:
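lexStr :: String -> [String]
lexStr s =
  case lex s of
    [("", _)]     -> []                  -- nothing left but (maybe) whitespace
    [(tok, rest)] -> tok : lexStr rest   -- one token lexed; continue with the rest
    _             -> error "Failed lex"  -- no token could be lexed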
To appease the pedants, this code is in terrible form: an explicit call to error instead of returning a reasonable error using Maybe, assuming lex only returns 1 or 0 elements, etc. The code that does this reliably is about the same length, but is significantly more abstract, so I spared your beginner eyes.
I would take a look at parsec and build a simple grammar for parsing your strings.
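For illustration only, a minimal sketch of what such a grammar could look like, assuming the parsec package is available. The names tok, tokensP and tokenizeLine are invented here, and the set of operator characters is a guess based on the examples in the question:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One token: a run of letters/digits, or a run of operator characters.
-- The operator set below is an assumption; extend it as needed.
tok :: Parser String
tok = many1 alphaNum <|> many1 (oneOf ":=+-*/<>")

-- A whole line: tokens with optional whitespace around them, then end of input.
tokensP :: Parser [String]
tokensP = spaces *> many (tok <* spaces) <* eof

-- Run the parser; Left carries a parse error with its position.
tokenizeLine :: String -> Either ParseError [String]
tokenizeLine = parse tokensP ""
```

With this, map tokenizeLine over the input list would yield one Either per line, so a malformed line is reported instead of crashing the program.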
How about using words? :)
words :: String -> [String]
and words won't care about extra whitespace:
words "Hello World"
= words "Hello      World"
= ["Hello", "World"]
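As a quick check of how far words alone gets on the inputs from the question, here is a small sketch (the names spaced, mixed and tight are just labels chosen for this example). It splits on whitespace only, so operators that touch their operands stay glued to them:

```haskell
-- The three input forms from the question.
spaced, mixed, tight :: [String]
spaced = ["int x = 1", "y := x + 123"]
mixed  = ["int x= 1", "y:= x+123"]
tight  = ["int x=1", "y:=x+123"]

-- words splits on whitespace only.
spacedTokens, mixedTokens, tightTokens :: [[String]]
spacedTokens = map words spaced  -- [["int","x","=","1"],["y",":=","x","+","123"]]
mixedTokens  = map words mixed   -- [["int","x=","1"],["y:=","x+123"]]
tightTokens  = map words tight   -- [["int","x=1"],["y:=x+123"]]
```

So words covers the first form, but the other two still need something that knows about "=", ":=" and "+".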