Is there a way to improve this ANTLR 3 Grammar for positive and negative integer and decimal numbers?

Is there a way to express this in a less repetitive fashion, given the optional positive and negative signs?

What I am trying to accomplish is to optionally allow a positive sign + (the default) or a negative sign - on number literals that optionally have exponents and/or decimal parts.

NUMBER : ('+'|'-')? DIGIT+ '.' DIGIT* EXPONENT?
       | ('+'|'-')? '.'? DIGIT+ EXPONENT?
       ;

fragment 
EXPONENT : ('e' | 'E') ('+' | '-') ? DIGIT+ 
         ;

fragment
DIGIT  : '0'..'9' 
       ;

I want to be able to recognize NUMBER patterns and am not concerned about arithmetic on those numbers at this point (that will come later). For now I am trying to understand how to recognize NUMBER literals that look like:

123
+123
-123
0.123
+.123
-.123
123.456
+123.456
-123.456
123.456e789
+123.456e789
-123.456e789 

and any other standard formats that I haven't thought to include here.


To answer your question: no, there is no way to improve this AFAIK. You could place ('+' | '-') inside a fragment rule and use that fragment, just like the exponent-fragment, but I wouldn't call it a real improvement.
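To make the "sign as its own fragment" idea concrete outside ANTLR, here is a minimal Python sketch that mirrors the lexer rules as a regular expression. The names SIGN, DIGIT, EXPONENT and NUMBER follow the grammar; the helper is_number is hypothetical and only there for illustration. It is an approximation of the rule, not a statement about how ANTLR matches internally.

```python
import re

# Regex stand-ins for the grammar's fragment rules (an illustrative mirror,
# not generated by ANTLR).
SIGN = r"[+-]"
DIGIT = r"[0-9]"
EXPONENT = rf"[eE]{SIGN}?{DIGIT}+"

# The same two alternatives as NUMBER, with the optional sign factored out
# in front instead of being repeated in each alternative.
NUMBER = re.compile(
    rf"{SIGN}?(?:{DIGIT}+\.{DIGIT}*|\.?{DIGIT}+)(?:{EXPONENT})?"
)


def is_number(text: str) -> bool:
    """Return True if the whole string is a single NUMBER literal."""
    return NUMBER.fullmatch(text) is not None


samples = [
    "123", "+123", "-123",
    "0.123", "+.123", "-.123",
    "123.456", "+123.456", "-123.456",
    "123.456e789", "+123.456e789", "-123.456e789",
]
print(all(is_number(s) for s in samples))
```

As in the grammar, the sign is written once and reused, which is about as much repetition as can be removed.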

Note that unary + and - signs generally are not part of a number token. Consider the input source "1-2". You don't want that to be tokenized as two numbers, NUMBER[1] and NUMBER[-2], but as NUMBER[1], MINUS[-] and NUMBER[2]. The lexer then produces unsigned numbers and the parser handles the signs, like this:

parse
  :  statement+ EOF
  ;

statement
  :  assignment
  ;

assignment
  :  IDENTIFIER '=' expression
  ;

expression
  :  addition
  ;

addition
  :  multiplication (('+' | '-') multiplication)*
  ;

multiplication
  :  unary (('*' | '/') unary)*
  ;

unary
  :  '-' atom
  |  '+' atom
  |  atom
  ;

atom
  :  NUMBER
  |  IDENTIFIER
  |  '(' expression ')'
  ;

IDENTIFIER
  :  ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | DIGIT)*
  ;

NUMBER 
  :  DIGIT+ '.' DIGIT* EXPONENT?
  |  '.'? DIGIT+ EXPONENT?
  ;

fragment 
EXPONENT 
  :  ('e' | 'E') ('+' | '-') ? DIGIT+ 
  ;

fragment
DIGIT  
  :  '0'..'9' 
  ;

and addition will therefore match the input "1-2".
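The "1-2" problem can be reproduced with a small regex-based tokenizer. This is a rough Python sketch, assuming a tokenizer that tries NUMBER before the operators at each position; the function tokenize and its signed_numbers flag are hypothetical and only loosely imitate what a generated lexer does.

```python
import re

# Unsigned NUMBER, following the two alternatives of the grammar rule.
UNSIGNED = r"(?:[0-9]+\.[0-9]*|\.?[0-9]+)(?:[eE][+-]?[0-9]+)?"


def tokenize(text, signed_numbers):
    """Split text into (type, lexeme) pairs, trying NUMBER before operators."""
    number = (r"[+-]?" if signed_numbers else "") + UNSIGNED
    pattern = re.compile(
        rf"(?P<NUMBER>{number})|(?P<OP>[-+*/=()])|(?P<WS>\s+)"
    )
    tokens, pos = [], 0
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        if match.lastgroup != "WS":  # whitespace is skipped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Sign inside the token: the minus is swallowed by the second number.
print(tokenize("1-2", signed_numbers=True))
# Sign outside the token: the minus survives as its own operator token.
print(tokenize("1-2", signed_numbers=False))
```

With the sign inside NUMBER there is no operator left between the two numbers, so a rule like addition has nothing to match on.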

EDIT

An expression like 111.222 + -456 is parsed as follows:

[parse tree image not preserved]

and +123 + -456 as:

[parse tree image not preserved]
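The tree shapes for these two inputs can be approximated with a small hand-written recursive-descent parser. This Python sketch follows the addition, multiplication, unary and atom rules of the grammar but leaves out IDENTIFIER and assignment for brevity. The nested-tuple output format and the names Parser, parse and tokenize are assumptions made for illustration, not what ANTLR generates.

```python
import re

# Unsigned NUMBER or a single-character operator, after optional whitespace.
TOKEN = re.compile(
    r"\s*(?:(?P<NUMBER>(?:[0-9]+\.[0-9]*|\.?[0-9]+)(?:[eE][+-]?[0-9]+)?)"
    r"|(?P<OP>[-+*/()]))"
)


def tokenize(text):
    """Return a list of (type, lexeme) pairs for the whole input."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


class Parser:
    """One method per parser rule of the grammar (IDENTIFIER omitted)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Lexeme of the next token, or None at end of input.
        return self.tokens[self.pos][1] if self.pos < len(self.tokens) else None

    def take(self):
        if self.pos >= len(self.tokens):
            raise ValueError("unexpected end of input")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def addition(self):
        node = self.multiplication()
        while self.peek() in ("+", "-"):
            op = self.take()[1]
            node = (op, node, self.multiplication())
        return node

    def multiplication(self):
        node = self.unary()
        while self.peek() in ("*", "/"):
            op = self.take()[1]
            node = (op, node, self.unary())
        return node

    def unary(self):
        if self.peek() in ("+", "-"):
            op = self.take()[1]
            return ("unary" + op, self.atom())
        return self.atom()

    def atom(self):
        kind, lexeme = self.take()
        if kind == "NUMBER":
            return ("NUMBER", lexeme)
        if lexeme == "(":
            node = self.addition()
            if self.peek() != ")":
                raise ValueError("expected ')'")
            self.take()
            return node
        raise ValueError(f"unexpected token {lexeme!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.addition()
    if parser.pos != len(parser.tokens):
        raise ValueError("trailing input")
    return tree


print(parse("111.222 + -456"))
print(parse("+123 + -456"))
```

In both trees the sign ends up as a unary node wrapped around an unsigned NUMBER, while the binary + stays at the root.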
