ANTLR treats part of string as a keyword

2023-01-27 05:59

I'm currently teaching myself ANTLR. First off, I decided to write the simplest grammar I could. There is a plain text file with directives:

pid = something.pid
log = something.log

The grammar I wrote is:

grammar TestGrammar;

options {
  language = Java;
}

@header {
  package test.antlr;
}

@lexer::header {
  package test.antlr;
}

program
  : directive+
  ;

directive
  : pid
  | log
  ;

pid
  : PID EQ (WORD|POINT)+
  ;

log
  : LOG EQ (WORD|POINT)+
  ;

WS: ( ' '
    | '\t'
    | '\r'
    | '\n'
    ) {$channel=HIDDEN;}
    ;

PID
  : 'pid'
  ;

LOG
  : 'log'
  ;

EQ
  : '='
  ;

POINT
  : '.'
  ;

WORD
  : ('a'..'z'|'A'..'Z'|'_')+
  ;

I feel I made a mistake somewhere, and ANTLR confirms it by throwing a MismatchedTokenException. It seems to treat the pid in something.pid as the start of a directive and then throws the exception.

However, I don't understand what I am doing wrong. Any help would be appreciated.

Thanks.

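The lexer is a very simple object: it tokenizes the input source without interference from the parser. Because PID is defined before WORD and both can match the text pid, every standalone pid in the input becomes a PID token. So, the input: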

pid = something.pid

is not tokenized as:

PID EQ WORD POINT WORD

but as:

PID EQ WORD POINT PID

That's why your rule:

pid
  : PID EQ (WORD|POINT)+
  ;

matches "pid = something." and leaves the second "pid" in the token stream. The parser then tries to start a new directive at that PID token and expects an EQ after it (hence the exception).
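To make the tokenization above concrete, here is a minimal hand-written sketch in Java. It is not the lexer ANTLR generates; the class LexerSketch and its helpers are hypothetical names, and the keyword check simply assumes that a run of letters equal to pid or log is emitted as PID or LOG, as described above.

```java
import java.util.ArrayList;
import java.util.List;

public class LexerSketch {
    // Hypothetical helper: a run of letters that is exactly "pid" or "log"
    // becomes a keyword token, mirroring PID and LOG winning over WORD.
    static String classify(String text) {
        if (text.equals("pid")) return "PID";
        if (text.equals("log")) return "LOG";
        return "WORD";
    }

    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // Tokenizes the input without any knowledge of the parser rules.
    static List<String> tokenize(String input) {
        List<String> types = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                i++; // whitespace is hidden from the parser
            } else if (c == '=') {
                types.add("EQ");
                i++;
            } else if (c == '.') {
                types.add("POINT");
                i++;
            } else if (isWordChar(c)) {
                int start = i;
                while (i < input.length() && isWordChar(input.charAt(i))) i++;
                types.add(classify(input.substring(start, i)));
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // prints [PID, EQ, WORD, POINT, PID]
        System.out.println(tokenize("pid = something.pid"));
    }
}
```

The tokenizer never asks which parser rule is active, so the trailing pid cannot come out as a WORD.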

A possible fix would be to do something like this:

pid
  : PID EQ (word|POINT)+
  ;

log
  : LOG EQ (word|POINT)+
  ;

word
  : WORD
  | PID
  | LOG 
  ;
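The effect of the word rule can be sketched in plain Java over token type names. This is only an illustration for a single directive, not what ANTLR generates; WordRuleSketch and consumed are hypothetical names, and the token list is the one shown earlier for pid = something.pid.

```java
import java.util.List;
import java.util.Set;

public class WordRuleSketch {
    // Counts how many tokens the rule `PID EQ (value)+` consumes from the
    // start of a single directive. valueTypes lists the token types accepted
    // inside the loop. Returns 0 if the rule cannot match at all.
    static int consumed(List<String> tokens, Set<String> valueTypes) {
        if (tokens.size() < 3
                || !tokens.get(0).equals("PID")
                || !tokens.get(1).equals("EQ")) {
            return 0;
        }
        int i = 2;
        while (i < tokens.size() && valueTypes.contains(tokens.get(i))) i++;
        return i > 2 ? i : 0; // the loop must match at least once
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("PID", "EQ", "WORD", "POINT", "PID");
        // Original rule (WORD|POINT)+ stops before the trailing PID: prints 4
        System.out.println(consumed(tokens, Set.of("WORD", "POINT")));
        // With word : WORD | PID | LOG the whole directive is consumed: prints 5
        System.out.println(consumed(tokens, Set.of("WORD", "PID", "LOG", "POINT")));
    }
}
```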

Or you could do something like:

pid
  : PID EQ FULL_WORD
  ;

log
  : LOG EQ FULL_WORD
  ;

// ...

FULL_WORD
  : WORD (POINT WORD)*
  ;

// ...
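As a rough illustration of the FULL_WORD idea, the following Java sketch uses a regular expression with the same shape as WORD (POINT WORD)*. It is an assumption-laden stand-in rather than ANTLR output: FullWordSketch and its TOKEN pattern are hypothetical, and a bare pid or log is assumed to still come out as a keyword.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FullWordSketch {
    // Group 1: the FULL_WORD shape, WORD (POINT WORD)*
    // Group 2: EQ
    // Group 3: whitespace
    private static final Pattern TOKEN = Pattern.compile(
        "([a-zA-Z_]+(?:\\.[a-zA-Z_]+)*)|(=)|([ \\t\\r\\n]+)");

    static List<String> tokenize(String input) {
        List<String> types = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at " + pos);
            }
            if (m.group(1) != null) {
                String text = m.group(1);
                // A bare "pid" or "log" is still reported as a keyword.
                types.add(text.equals("pid") ? "PID"
                        : text.equals("log") ? "LOG"
                        : "FULL_WORD");
            } else if (m.group(2) != null) {
                types.add("EQ");
            } // whitespace (group 3) is skipped
            pos = m.end();
        }
        return types;
    }

    public static void main(String[] args) {
        // prints [PID, EQ, FULL_WORD]
        System.out.println(tokenize("pid = something.pid"));
    }
}
```

Here something.pid is consumed as one token, so the trailing pid never reaches the parser on its own. In this sketch a value that is exactly pid or log would still be reported as a keyword, so that case would need separate handling.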
