
Programming a simple compiler

I am writing a compiler for a simple language.

I made a lexer/tokenizer that takes a file and prints the tokens in stdout.

Now I want to do the syntax analysis, but I don't know how to modify my lexer so that the parser can take the tokens as input.

  • Storing all the tokens in a linked list is extremely inefficient for large files (source files of around 80 MB take about 1.3 GB of RAM).
  • I could modify my lexer to return the next token each time it is called (an idea taken from the Dragon Book), but I don't know what to do if, somewhere in the process, I have to go back and read a previous token.

What is the right way to do these things?


Implementing a nextToken() method in the lexical analyser is the standard way. This method is called by the parser (or syntax analyser) until the entire input has been consumed.
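A minimal sketch of that interface in C, assuming a hand-written lexer (the token kinds, the Lexer struct and the helper names are invented for illustration, not taken from your code): each call to nextToken() reads just enough characters from the file to produce one token, so the whole file never has to be tokenized into memory at once.

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical token kinds -- replace with the tokens of your language. */
    typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PLUS, TOK_EOF } TokenKind;

    typedef struct {
        TokenKind kind;
        char text[64];
    } Token;

    typedef struct {
        FILE *src;                              /* source file the lexer reads from */
    } Lexer;

    /* Produce one token per call, reading characters on demand. */
    Token nextToken(Lexer *lx) {
        Token t = { TOK_EOF, "" };
        int c = fgetc(lx->src);

        while (c != EOF && isspace(c))          /* skip whitespace */
            c = fgetc(lx->src);

        if (c == EOF)
            return t;

        if (isalpha(c)) {                       /* identifier */
            size_t n = 0;
            while (c != EOF && isalnum(c) && n + 1 < sizeof t.text) {
                t.text[n++] = (char)c;
                c = fgetc(lx->src);
            }
            if (c != EOF) ungetc(c, lx->src);   /* push the lookahead char back */
            t.text[n] = '\0';
            t.kind = TOK_IDENT;
        } else if (isdigit(c)) {                /* integer literal */
            size_t n = 0;
            while (c != EOF && isdigit(c) && n + 1 < sizeof t.text) {
                t.text[n++] = (char)c;
                c = fgetc(lx->src);
            }
            if (c != EOF) ungetc(c, lx->src);
            t.text[n] = '\0';
            t.kind = TOK_NUMBER;
        } else if (c == '+') {                  /* a sample single-char operator */
            t.kind = TOK_PLUS;
            strcpy(t.text, "+");
        }                                       /* (a real lexer would report an
                                                   error for other characters) */
        return t;
    }

    /* Tiny driver: lex a file (or stdin) and print the tokens, one per line. */
    int main(int argc, char **argv) {
        FILE *f = argc > 1 ? fopen(argv[1], "r") : stdin;
        if (!f) { perror("fopen"); return 1; }
        Lexer lx = { f };
        for (Token t = nextToken(&lx); t.kind != TOK_EOF; t = nextToken(&lx))
            printf("%d\t%s\n", (int)t.kind, t.text);
        return 0;
    }

The main() here just reproduces what your current tokenizer does (print the tokens to stdout); the parser would instead call nextToken() directly whenever it wants the next token.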

but I don't know what to do if somewhere in the process I have to go back and read a previous token

This is not usually the case. But what the parser may need to do is 'push back' a token (or a number of tokens, depending on the parser's lookahead) that it has already seen. In that case the lexer provides a pushBack(Token) method that ensures the next call to nextToken() returns the supplied token rather than the next token in the input.
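A sketch of that pushback mechanism layered on top of the nextToken() example above (TokenStream, tsNext() and the single-slot buffer are illustrative names, not a fixed API; a parser with k tokens of lookahead would use a small array of k slots instead):

    /* One-slot pushback buffer on top of nextToken().  Token and Lexer are
     * the types from the sketch above. */
    typedef struct {
        Lexer lexer;
        Token pushed;
        int   hasPushed;
    } TokenStream;

    Token tsNext(TokenStream *ts) {
        if (ts->hasPushed) {            /* hand back the pushed token first */
            ts->hasPushed = 0;
            return ts->pushed;
        }
        return nextToken(&ts->lexer);   /* otherwise lex a fresh token */
    }

    void pushBack(TokenStream *ts, Token t) {
        ts->pushed = t;                 /* the next tsNext() call returns t */
        ts->hasPushed = 1;
    }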


but I don't know what to do if somewhere in the process I have to go back and read a previous token

It sounds like your matches are too greedy.

You might look into backtracking.
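If you do need to backtrack, one common approach (a sketch only, reusing the Token/Lexer/nextToken() types from the other answer; TokenBuffer, markPos() and resetPos() are made-up names) is to buffer tokens as they are lexed and let the parser save and restore an index into that buffer:

    #include <stdlib.h>

    /* Growable token buffer that supports backtracking: the parser records a
     * mark (its current index) before trying a rule and resets to it if the
     * rule fails. */
    typedef struct {
        Token  *items;
        size_t  count, cap;
        size_t  pos;                 /* the parser's current read position */
    } TokenBuffer;

    size_t markPos(const TokenBuffer *buf)         { return buf->pos; }
    void   resetPos(TokenBuffer *buf, size_t mark) { buf->pos = mark; }

    /* Return the token at the current position, lexing more on demand so the
     * whole file never has to be tokenized up front. */
    Token currentToken(TokenBuffer *buf, Lexer *lx) {
        while (buf->pos >= buf->count) {
            if (buf->count == buf->cap) {
                buf->cap = buf->cap ? buf->cap * 2 : 64;
                buf->items = realloc(buf->items, buf->cap * sizeof *buf->items);
            }
            buf->items[buf->count++] = nextToken(lx);
        }
        return buf->items[buf->pos];
    }

    void advance(TokenBuffer *buf) { buf->pos++; }

The parser calls markPos() before trying an alternative and resetPos() if that alternative fails. Once a rule has committed, any tokens older than the oldest live mark can be discarded, which keeps memory bounded even for very large files.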

