How can I write a quick and dirty interpreter? [duplicate]
I have an interview where one of the areas I was told I might brush up on is "dynamic programming languages". So I figured I might spend this weekend writing one to bring as sample code. :-)
Of course, given the time constraints, I plan on writing something very basic and preferably using a language and/or toolset that will make it extremely easy to do. Most of my experience is in Python, but I'm willing to spend a little time learning something new if it will make the task easier (and won't take too long). Does anyone have any advice for me in terms of tools or languages that will make this easier?
If you want to write a a very simple interpretive language, you should take a look at FORTH. The lexer is trivial (tokens are space separated) and the interpreter is also very simple. And if FORTH is too retro, take a look at Scheme - you can build a tiny Scheme interpreter very quickly.
Simple interpreter written in OCaml
My example interpreter described in full here.
Types of expressions and values
Expressions:
type expr =
| EAdd of expr * expr
| EApply of expr * expr
| EEqual of expr * expr
| EIf of expr * expr * expr
| EInt of int
| ELetRec of string * string * expr * expr
| EMul of expr * expr
| EVar of string
Values:
type value =
| VInt of int
| VBool of bool
| VClosure of string * (string * value) list * expr
Lexing and parsing
Use Camlp4 for parsing:
#load "camlp4o.cma"
Define the lexer:
open Genlex
let keywords =
["("; ")"; "+"; "-"; "=";
"if"; "then"; "else";
"let"; "rec"; "in"]
let lex stream =
let rec aux = parser
| [< 'Int n when n<0; t=aux >] -> [< 'Kwd "-"; 'Int(-n); t >]
| [< 'h; t=aux >] -> [< 'h; t >]
| [< >] -> [< >] in
aux(make_lexer keywords stream)
Define the parser:
let rec parse_atom = parser
| [< 'Int n >] -> EInt n
| [< 'Ident v >] -> EVar v
| [< 'Kwd "("; e=parse_expr; 'Kwd ")" >] -> e
and parse_apply = parser
| [< e1=parse_atom; stream >] ->
(parser
| [< e2=parse_atom >] -> EApply(e1, e2)
| [< e2=parse_apply >] -> begin match e2 with
| EApply(e2, e3) -> EApply(EApply(e1, e2), e3)
| e2 -> EApply(e1, e2)
end
| [< >] -> e1) stream
and parse_arith = parser
| [< e1=parse_apply; stream >] ->
(parser
| [< 'Kwd "+"; e2=parse_arith >] -> EAdd(e1, e2)
| [< 'Kwd "-"; e2=parse_arith >] -> EAdd(e1, EMul(EInt(-1), e2))
| [< >] -> e1) stream
and parse_expr : 'a Stream.t -> expr = parser
| [< e1=parse_arith; stream >] ->
(parser
| [< 'Kwd "="; e2=parse_expr >] -> EEqual(e1, e2)
| [< >] -> e1) stream
| [< 'Kwd "if"; p=parse_expr; 'Kwd "then"; t=parse_expr;
'Kwd "else"; f=parse_expr >] ->
EIf(p, t, f)
| [< 'Kwd "let"; 'Kwd "rec"; 'Ident f; 'Ident x; 'Kwd "="; body=parse_expr;
'Kwd "in"; rest=parse_expr >] ->
ELetRec(f, x, body, rest)
Evaluation
Unbox an int or bool:
let int = function VInt n -> n | _ -> invalid_arg "int"
let bool = function VBool b -> b | _ -> invalid_arg "bool"
Evaluate an expression to give a value in the context of some bound vars
:
let rec eval vars = function
| EApply(func, arg) ->
begin
match eval vars func, eval vars arg with
| VClosure(var, vars, body), arg -> eval ((var, arg) :: vars) body
| _ -> invalid_arg "Attempt to apply a non-function value"
end
| EAdd(e1, e2) -> VInt (int(eval vars e1) + int(eval vars e2))
| EMul(e1, e2) -> VInt (int(eval vars e1) * int(eval vars e2))
| EEqual(e1, e2) -> VBool (eval vars e1 = eval vars e2)
| EIf(p, t, f) -> eval vars (if bool (eval vars p) then t else f)
| EInt i -> VInt i
| ELetRec(var, arg, body, rest) ->
let rec vars = (var, VClosure(arg, vars, body)) :: vars in
eval vars rest
| EVar s -> List.assoc s vars
Worked example
Define an example program as a string:
let program = "let rec fib n = if n=0 then 0 else if n=1 then 1 else fib(n - 1) + fib(n - 2) in fib 30"
Lex and parse the string into an AST:
let ast = parse_expr(lex(Stream.of_string program))
Evaluate the AST:
eval [] ast
You'll probably want to check out Lex and Yacc for lexing and parsing, and their python implementations
I used spark to write a pretty full featured DSL to express complicated conditionals for an old project of mine in 2 or 3 days (including unit tests).
It should be quite trivial in Python since spark (and other such modules) will give you tools you need to write a lexical and syntax analyser. You can implement a simple symbol table easily using a python dictionary and can either translate it into Python and eval or move it to some lower level language to run.
An interpreted language != a dynamic language, though the opposite is not always true.
If you are quite versed in Python (== dynamic) then I think you should do well in your interview unless they ask the difference between interpreted and dynamic languages.
I'd recommend using Haskell with parsing combinators. To bone up on parsing combinators, don't use the Wikipedia article; it's very theoretical and will likely confuse you. Instead use the paper by Graham Hutton, which is excellent.
Interpreters and compilers are the "killer app" for the ML/Haskell family of languages, and I think you'll be amazed at how quickly you can build something interesting.
For getting started I recommend you read Phil Wadler's paper The Essence of Functional Programming, which contains a number of sample interpreters organized using monads. I think the example interpreters are well organized and easy to follow, although the explanation of monads in that paper may make your head hurt.
There's also a very nice blog entry that goes through a single example in much more detail; it describes a Lisp interpreter written in Haskell. That writeup also contains a few comparisons between Haskell and Java, which may give you a feel for why many compiler writers prefer a functional language over an OO language for writing compilers and interpreters.
Have fun!!!!
The typical toy example for an interpreter is the Brainfuck language.
精彩评论