开发者

ANTLR grammar: parser- and lexer literals

What's the difference between this grammar:

...
if_statement : 'if' condition 'then' statement 'else' statement 'end_if';
... 

and this:

...
if_statement : IF condition THEN statement ELSE statement END_IF;
...

IF : 'if';
THEN: 'then';
ELSE: 'els开发者_如何学Goe';
END_IF: 'end_if';
....

?

If there is any difference, as this impacts on performance ... Thanks


In addition to Will's answer, it's best to define your lexer tokens explicitly (in your lexer grammar). In case you're mixing them in your parser grammar, it's not always clear in what order the tokens are tokenized by the lexer. When defining them explicitly, they're always tokenized in the order they've been put in the lexer grammar (from top to bottom).


The biggest difference is one that may not matter to you. If your Lexer rules are in the lexer then you can use inheritance to have multiple lexer's share a common set of lexical rules.

If you just use strings in your parser rules then you can not do this. If you never plan to reuse your lexer grammar then this advantage doesn't matter.

Additionally I, and I'm guessing most Antlr veterans, are more accustom to finding the lexer rules in the actual lexer grammar rather than mixed in with the parser grammar, so one could argue the readability is increased by putting the rules in the lexer.

There is no runtime performance impact after the Antlr parser has been built to either approach.


The only difference is that in your first production rule, the keyword tokens are defined implicitly. There is no run-time performance implication for tokens defined implicitly vs. explicitly.


Yet another difference: when you explicitly define your lexer rules you can access them via the name you gave them (e.g. when you need to check for a specific token type). Otherwise ANTLR will use arbitrary numbers (with a prefix).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜