OCaml lex: doesn't work at all, whatsoever

Asked: 2023-02-19 19:46

I am at the end of my rope here. I cannot get anything to work in ocamllex, and it is driving me nuts. This is my .mll file:

{

open Parser

}

rule next = parse
  | (['a'-'z'] ['a'-'z']*) as id { Identifier id }
  | '=' { EqualsSign }
  | ';' { Semicolon }
  | '\n' | ' ' { next lexbuf }
  | eof { EOF }

Here are the contents of the file I pass in as input:

a=b;

Yet, when I compile and run the thing, I get an error on the very first character, saying it's not valid. I honestly have no idea what's going on, and Google has not helped me at all. How can this even be possible? As you can see, I'm really stumped here.

EDIT:

I was working for so long that I gave up on the parser. Now this is the relevant code in my main file:

let parse_file filename =
  let l = Lexing.from_channel (open_in filename) in
    try
      Lexer.next l; ()
    with
      | Failure msg ->
        printf "line: %d, col: %d\n" l.lex_curr_p.pos_lnum l.lex_curr_p.pos_cnum

Prints out "line: 1, col: 1".

Without the corresponding ocamlyacc parser, nobody will be able to find the issue with your code since your lexer works perfectly fine!

I have taken the liberty of writing the following tiny parser (parser.mly) that constructs a list of identifier pairs, e.g. input "a=b;" should give the singleton list [("a", "b")].

%{%}

%token <string> Identifier
%token EqualsSign
%token Semicolon
%token EOF

%start start
%type <(string * string) list> start

%%

start:
| EOF {[]}
| Identifier EqualsSign Identifier Semicolon start {($1, $3) :: $5}
;

%%

To test whether the parser does what I promised, we create another file (main.ml) that parses the string "a=b;" and prints the result.

let print_list = List.iter (fun (a, b) -> Printf.printf "%s = %s;\n" a b)
let () = print_list (Parser.start Lexer.next (Lexing.from_string "a=b;"))

The code should compile (e.g. ocamlbuild main.byte) without any complaints and the program should output "a = b;" as promised.

In response to the latest edit:

In general, I don't believe that catching standard library exceptions that are meant to indicate failure or misuse (like Invalid_argument or Failure) is a good idea. They are used ubiquitously throughout the library, so you usually cannot tell which function raised the exception or why.

Furthermore, you are throwing away the only useful information: the error message! The error message should tell you what the source of the problem is (my best guess is an IO-related issue). Thus, you should either print the error message or let the exception propagate to the toplevel. Personally, I prefer the latter option.
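As a minimal sketch of the first option (printing the message), the helper below keeps the Failure text and reports the position next to it. The names column, describe and run_reporting are made up for illustration and are not part of any library. Note that pos_cnum is an absolute character offset, so the column within the line is computed here as pos_cnum - pos_bol.

```ocaml
(* Column within the current line: absolute offset minus the offset
   of the start of the line (0-based). *)
let column (p : Lexing.position) : int =
  p.Lexing.pos_cnum - p.Lexing.pos_bol

(* Render a position as "line L, col C". *)
let describe (p : Lexing.position) : string =
  Printf.sprintf "line %d, col %d" p.Lexing.pos_lnum (column p)

(* Run [lex] on [lexbuf]. If it raises Failure, keep the message
   and prefix it with the current position instead of dropping it. *)
let run_reporting (lex : Lexing.lexbuf -> 'a) (lexbuf : Lexing.lexbuf)
  : ('a, string) result =
  try Ok (lex lexbuf)
  with Failure msg ->
    Error (Printf.sprintf "%s: %s" (describe lexbuf.Lexing.lex_curr_p) msg)
```

In the setting of the question, the call would presumably look like run_reporting Lexer.next (Lexing.from_channel (open_in filename)), assuming the generated Lexer module. Keep in mind that pos_lnum only advances if the lexer calls Lexing.new_line on newlines.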

However, you probably still want to deal with syntactically ill-formed inputs in a graceful manner. For this, you can define a new exception in the lexer and add a default case that catches invalid tokens.

{
  exception Unexpected_token
}
...
| _ {raise Unexpected_token}

Now, you can catch the newly defined exception in your main file and, unlike before, the exception is specific to syntactically invalid inputs. Consequently, you know both the source and the cause of the exception giving you the chance to do something far more meaningful than before.
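A rough sketch of what the catching side could look like follows. To stay self-contained it declares Unexpected_token locally and takes the parsing function as a parameter; in the real program the exception would be Lexer.Unexpected_token and the function something like Parser.start Lexer.next. The name parse_with is hypothetical.

```ocaml
(* Stand-in for the exception declared in the lexer header. *)
exception Unexpected_token

(* Parse [input] with [parse]. Only Unexpected_token is handled here;
   every other exception still propagates to the caller. *)
let parse_with (parse : Lexing.lexbuf -> 'a) (input : string)
  : ('a, string) result =
  let lexbuf = Lexing.from_string input in
  try Ok (parse lexbuf)
  with Unexpected_token ->
    (* lexeme_start is the offset where the last matched lexeme began. *)
    Error (Printf.sprintf "unexpected token at offset %d"
             (Lexing.lexeme_start lexbuf))
```

Because the handler names one specific exception, an unrelated Failure or Sys_error is not swallowed by accident.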

A fairly random OCaml development hint: If you compile the program with debug information enabled, setting the environment variable OCAMLRUNPARAM to "b" (e.g. export OCAMLRUNPARAM=b) enables stack traces for uncaught exceptions!
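If setting an environment variable is inconvenient, the standard library's Printexc module can turn on backtrace recording from inside the program. The sketch below is only an illustration; risky and message_and_trace are invented names, and the backtrace text is only informative when the program was compiled with debug information.

```ocaml
(* Enable backtrace recording at program start. *)
let () = Printexc.record_backtrace true

(* A hypothetical function that fails. *)
let risky () : unit = failwith "boom"

(* Run [f]; on any exception return its printed form together with
   the recorded backtrace (which may be empty without debug info). *)
let message_and_trace (f : unit -> unit) : (string * string) option =
  try f (); None
  with e -> Some (Printexc.to_string e, Printexc.get_backtrace ())
```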

By the way, ocamllex also supports the + operator for "one or more" in regular expressions, so this

['a'-'z']+

is equivalent to your

['a'-'z']['a'-'z']*

I was just struggling with the same thing (which is how I found this question), only to finally realize that I had mistakenly specified the path to the input file as Sys.argv.(0) instead of Sys.argv.(1)! LOLs
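One way to guard against this slip is to pick the path through a small helper that checks the argument count first. This is just a sketch; input_path is a made-up name, and the usage message format is arbitrary.

```ocaml
(* Pick the input path from an argv-style array.
   argv.(0) is the program itself; the first real argument is argv.(1). *)
let input_path (argv : string array) : (string, string) result =
  if Array.length argv < 2 then begin
    (* Fall back to a placeholder name if argv is completely empty. *)
    let prog = if Array.length argv > 0 then argv.(0) else "prog" in
    Error (Printf.sprintf "usage: %s <input-file>" prog)
  end else
    Ok argv.(1)
```

In a main file this would be called as input_path Sys.argv, printing the usage string and exiting on Error. Passing Sys.argv.(0) by mistake makes the lexer read the executable itself, which fails on the very first character.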

I really hope it helps! :)

It looks like you have a space in the regular expression for identifiers. This could keep the lexer from recognizing a=b, although it should still recognize a = b ;
