Is there a way to improve this ANTLR 3 Grammar for positive and negative integer and decimal numbers?
Is there a way to express this in a less repetitive fashion with the optional positive and negative signs?
What I am trying to accomplish is to optionally allow positive + (the default) and negative - signs on number literals that optionally have exponents and/or decimal parts.
NUMBER : ('+'|'-')? DIGIT+ '.' DIGIT* EXPONENT?
| ('+'|'-')? '.'? DIGIT+ EXPONENT?
;
fragment
EXPONENT : ('e' | 'E') ('+' | '-') ? DIGIT+
;
fragment
DIGIT : '0'..'9'
;
I want to be able to recognize NUMBER patterns. I am not so concerned about arithmetic on those numbers at this point (I will be later), but I am trying to understand how to recognize any NUMBER literals, where numbers look like:
123
+123
-123
0.123
+.123
-.123
123.456
+123.456
-123.456
123.456e789
+123.456e789
-123.456e789
and any other standard formats that I haven't thought to include here.
To answer your question: no, there is no way to improve this AFAIK. You could place ('+' | '-') inside a fragment rule and use that fragment, just like the exponent fragment, but I wouldn't call it a real improvement.
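For illustration, the fragment idea could look roughly like the sketch below. It is untested, and SIGN is a name made up here (it does not appear in the original grammar). The rest is copied from the question's rules.

```
// Sketch only: SIGN is a hypothetical fragment name introduced for illustration.
NUMBER
  :  SIGN? DIGIT+ '.' DIGIT* EXPONENT?
  |  SIGN? '.'? DIGIT+ EXPONENT?
  ;

// The sign alternatives are now written once and reused.
fragment
SIGN
  :  '+' | '-'
  ;

fragment
EXPONENT
  :  ('e' | 'E') SIGN? DIGIT+
  ;

fragment
DIGIT
  :  '0'..'9'
  ;
```

This only moves the repeated ('+' | '-') into one place; the set of inputs matched by NUMBER should be the same as before.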
Note that unary + and - signs are generally not part of a number token. Consider the input source "1-2". You don't want that to be tokenized as two numbers, NUMBER[1] and NUMBER[-2], but as NUMBER[1], MINUS[-] and NUMBER[2]. To get that, leave the sign out of the lexer rule and let your grammar contain the following:
parse
: statement+ EOF
;
statement
: assignment
;
assignment
: IDENTIFIER '=' expression
;
expression
: addition
;
addition
: multiplication (('+' | '-') multiplication)*
;
multiplication
: unary (('*' | '/') unary)*
;
unary
: '-' atom
| '+' atom
| atom
;
atom
: NUMBER
| IDENTIFIER
| '(' expression ')'
;
IDENTIFIER
: ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | DIGIT)*
;
NUMBER
: DIGIT+ '.' DIGIT* EXPONENT?
| '.'? DIGIT+ EXPONENT?
;
fragment
EXPONENT
: ('e' | 'E') ('+' | '-') ? DIGIT+
;
fragment
DIGIT
: '0'..'9'
;
and addition will therefore match the input "1-2".
EDIT
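An expression like 111.222 + -456 will be parsed as an addition whose left operand is NUMBER[111.222] and whose right operand is the unary rule matching '-' followed by NUMBER[456]. Likewise, +123 + -456 will be parsed as an addition of a unary '+' applied to NUMBER[123] and a unary '-' applied to NUMBER[456].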