Parsec - 'many' and error messages
When I try to parse many p, I don't receive the 'expecting p' message:
> parse (many (char '.') >> eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting end of input
Compare to
> parse (sepBy (char '.') (char ',') >> eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting "." or end of input
which reports "." as I'd expect. many1 p <|> return [] works as well.
All of these functions accept empty input, so why doesn't many report what it's expecting? Is it a bug or a feature?
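For reference, the many1-based variant mentioned above can be sketched as follows. The name many' is a made-up helper, not a Parsec function, and the snippet assumes String input via Text.Parsec.String. The exact wording of the error may vary between Parsec versions, so no output is shown.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper (not part of Parsec): zero-or-more built from
-- many1 and an explicit choice, so the failed alternative is remembered.
many' :: Parser a -> Parser [a]
many' p = many1 p <|> return []

-- Zero or more dots, then end of input.
dots :: Parser ()
dots = many' (char '.') >> eof

main :: IO ()
main = do
  print (parse dots "" "a")    -- fails at column 1
  print (parse dots "" "...")  -- succeeds
```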
You'll get better error messages with manyTill:
> parse (manyTill (char '.') eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting end of input or "."
This is just due to the way you chain with >>. If the first parser succeeds, then the second one is run. many succeeds (matching zero dots), so eof is tried. eof fails, so you only get eof's error message.
With manyTill, it tries both parsers (the second first) and, if both fail, the error messages are combined (this is because it uses <|> internally).
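The shape of that choice can be sketched roughly as below. This is a hand-written approximation for illustration, not a copy of Parsec's source; manyTill' is a hypothetical name, and String input is assumed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical re-creation of manyTill (the real one lives in
-- Text.Parsec.Combinator). The terminator is tried first; if it fails
-- without consuming input, p is tried, and <|> merges both "expecting" sets.
manyTill' :: Parser a -> Parser e -> Parser [a]
manyTill' p end = scan
  where
    scan = (end >> return [])
           <|> do { x <- p; xs <- scan; return (x : xs) }

main :: IO ()
main = do
  print (parse (manyTill' (char '.') eof) "" "a")    -- both alternatives fail
  print (parse (manyTill' (char '.') eof) "" "...")  -- three dots, then eof
```

On the failing input, the reported error should mention both the end of input and the dot, much like the manyTill transcript above.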
On the whole, though, it's easier to define your own errors with <?>:
> parse (many (char '.') >> eof <?> "lots of dots") "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting lots of dots
In a somewhat superficial sense, the reason for the difference in behavior is that many is a primitive parser, whereas sepBy is constructed in a similar manner to your reimplemented many. In the latter case, the "expecting..." message is built from the alternatives that were available along the path that led to the parse failure; with many there were no such choices, and it merely succeeded unconditionally.
I don't know that I'd describe this as either a bug or a feature, it's just sort of a quirk of how Parsec works. Error handling is not really Parsec's strength and this really doesn't seem like the first thing I'd worry about in that regard. If it bothers you sufficiently you may be better served by looking into other parsing libraries. I've heard good things about uu-parsinglib, for instance.
From the Haddock documentation:
many p applies the parser p zero or more times. Returns a list of the returned values of p.
So the empty string is valid input for the many combinator.
[Added]
Ah, now I see your point. An "expecting a or b" message is reported when <|> (the choice combinator) is used. many is implemented without using <|>, but sepBy uses it internally.
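A minimal sketch of how a sepBy-style combinator can be layered on <|> is given below. The primed names are hypothetical stand-ins written for this illustration and may differ in detail from the library's own definitions; String input is assumed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for sepBy: either at least one item, or nothing.
-- The <|> is what lets a failure of p show up in the "expecting" list.
sepBy' :: Parser a -> Parser sep -> Parser [a]
sepBy' p sep = sepBy1' p sep <|> return []

-- Hypothetical stand-in for sepBy1: one item, then zero or more
-- separator-item pairs.
sepBy1' :: Parser a -> Parser sep -> Parser [a]
sepBy1' p sep = do
  x  <- p
  xs <- many (sep >> p)
  return (x : xs)

main :: IO ()
main = do
  print (parse (sepBy' (char '.') (char ',') >> eof) "" "a")    -- fails
  print (parse (sepBy' (char '.') (char ',') >> eof) "" ".,.")  -- succeeds
```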
This is a bug introduced in parsec-3.1. If you test with prior versions you should get an error message like this:
> parse (many (char '.') >> eof) "" "a"
Left (line 1, column 1):
unexpected 'a'
expecting "." or end of input
At least, that's what I get after fixing the bug :-)