How to handle errors during parsing in F#
I'm using the fslex/fsyacc utilities for my F# Lexer and Parser. If the input text has incorrect syntax, I need to know where the error occurs.
It is possible to detect an incorrect lexeme (token) in the Lexer and throw an exception when an invalid symbol or word is used:
rule token = parse
    ...
    | integer { INT (Int32.Parse(lexeme lexbuf)) }
    | "*="    { failwith "Incorrect symbol" }
    | eof     { EOF }
The question is more about the Parser (fsyacc): the input text has correct tokens and was successfully tokenized by the Lexer, but an error happened during parsing (for example, an incorrect token order or a missing token in a rule).
I know that if I catch the exception, I can get the position (line and column) where parsing failed:
try
    Parser.start Lexer.token lexbuf
with e ->
    let pos = lexbuf.EndPos
    let line = pos.Line
    let column = pos.Column
    let message = e.Message // "parse error"
    ...
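For reference, the handler above can be wrapped into a small function. This is only a sketch: `tryParse` is a hypothetical helper name, the `FSharp.Text.Lexing` namespace assumes the FsLexYacc runtime package (the older PowerPack uses `Microsoft.FSharp.Text.Lexing`), and `parse` is assumed to be something like `Parser.start Lexer.token` from the generated modules.

```fsharp
open FSharp.Text.Lexing   // assumption: FsLexYacc runtime; PowerPack uses Microsoft.FSharp.Text.Lexing

/// Hypothetical helper: run a parse function and report where it failed.
/// `parse` is assumed to be something like `Parser.start Lexer.token`.
let tryParse (parse: LexBuffer<char> -> 'ast) (text: string) : Result<'ast, string> =
    let lexbuf = LexBuffer<char>.FromString text
    try
        Ok (parse lexbuf)
    with e ->
        // EndPos is where the lexer stopped reading, typically just after the offending token.
        let pos = lexbuf.EndPos
        // Text of the last token the lexer matched.
        let lastToken = LexBuffer<char>.LexemeString lexbuf
        Error (sprintf "%s at line %d, column %d, near \"%s\""
                   e.Message pos.Line pos.Column lastToken)
```

It would be called as `tryParse (Parser.start Lexer.token) inputText`. Note that `pos.Line` only advances if the lexer's newline rule updates the position itself, for example with `lexbuf.EndPos <- lexbuf.EndPos.NextLine`.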
But is it also possible (and if so, how) to determine the AST class for which parsing failed?
For example, is it possible to write something similar to the following in my parser.fsy file:
Expression1:
    | INT { Int $1 }
    ...
    | _ { failwith "Error with parsing in Expression1" }
Just skipping the "_" should lead to a shift/reduce conflict. For a small set of tokens, you could list them all. For a larger set of tokens, it is more problematic.
The F# compiler does something similar by defining prefixes of earlier rules, and sets an error state:
atomicPattern:
    ...
    | LPAREN parenPatternBody RPAREN
        { let m = (lhs(parseState)) in SynPat.Paren($2 m,m) }
    | LPAREN parenPatternBody recover
        { reportParseErrorAt (rhs parseState 1) (FSComp.SR.parsUnmatchedParen()); $2 (rhs2 parseState 1 2) }
    | LPAREN error RPAREN
        { (* silent recovery *) SynPat.Wild (lhs(parseState)) }
    | LPAREN recover
        { reportParseErrorAt (rhs parseState 1) (FSComp.SR.parsUnmatchedParen()); SynPat.Wild (lhs(parseState)) }

recover:
    | error { true }
    | EOF { false }
You can see the whole file in the repository.
More info on error handling in ocamlyacc/fsyacc can be found in the OCaml manual (Part III → Lexer and parser generators → Error handling).
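Applied to the rule from the question, the same idea might look roughly like the fragment below. It is an untested sketch: it relies on `error` being the reserved error-recovery token in fsyacc (as in the F# compiler grammar above) and assumes that an exception raised inside a semantic action propagates to the caller of `Parser.start`.

```fsharp
/* parser.fsy fragment (sketch, not verified against a real grammar) */
Expression1:
    | INT   { Int $1 }
    /* Reached when the parser recovers from a syntax error inside Expression1. */
    | error { failwith "Error with parsing in Expression1" }
```

The message then reaches the existing `try ... with` handler through `e.Message`, while the position still comes from `lexbuf.EndPos`. The parser unwinds to the nearest enclosing state that accepts `error`, so the rule named in the message may be an outer rule rather than the innermost one being parsed.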