Antlr parsing numbers problem
I have a problem parsing integer & hex numbers. I want to parse C++ enums with the following rules:
grammar enum;
rule_enum
: 'enum' ID '{' enum_values+ '}'';';
enum_values
: enum_value (COMMA enum_value)+;
enum_value
: ID ('=' number)?;
number : hex_number | integer_number;
hex_number
: '0' 'x' HEX_DIGIT+;
integer_number
: DIGIT+;
fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
fragment
DIGIT : ('0'..'9');
COMMA : ',';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
The problem I have is the following - when parsing code like:
enum Enum
{
Option1 = 0,
Option2 = 1
};
it does not recognize the 0 as integer_number but tries to parse it as hex_number. How can I resolve this?
Thank you. Tobias
First, fragment rules can only be "seen" by lexer rules, not parser rules. So, the following is invalid:
integer_number
: DIGIT+; // can't use DIGIT here!
fragment
DIGIT : ('0'..'9');
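In your grammar, the literals '0' and 'x' inside the parser rule hex_number become tokens of their own, so a lone 0 is tokenized as the '0' literal and the parser then expects an 'x' to follow. To fix this ambiguity, it's IMO best to make the integer and hex numbers lexer rules instead of parser rules.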
An example:
grammar enum;
rule_enum
: 'enum' ID '{' enum_values+ '}'';';
enum_values
: enum_value (COMMA enum_value)*; // '*' instead of '+' so a single value without a comma also matches
enum_value
: ID ('=' number)?;
number
: HEX_NUMBER
| INTEGER_NUMBER
;
HEX_NUMBER
: '0' 'x' HEX_DIGIT+;
INTEGER_NUMBER
: DIGIT+;
fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
fragment
DIGIT : ('0'..'9');
COMMA : ',';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
SPACE : (' ' | '\t' | '\r' | '\n') {skip();};
With this grammar, your example snippet parses as expected: 0 and 1 are each tokenized as INTEGER_NUMBER.
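To illustrate why moving the number rules into the lexer removes the ambiguity, here is a minimal Python sketch of a tokenizer that mirrors the lexer rules above. It is not ANTLR and not what ANTLR generates. The names tokenize, TOKEN_SPEC and SYMBOL are made up for this illustration, and the 'enum' keyword is simply reported as an ID. The point is only that each number is decided as a whole token, so a lone 0 can never be taken for the start of a hex number.

```python
import re

# Patterns mirror the lexer rules HEX_NUMBER, INTEGER_NUMBER, ID, COMMA, SPACE.
# SYMBOL is a hypothetical stand-in for the literal tokens '{', '}', '=', ';'.
TOKEN_SPEC = [
    ("HEX_NUMBER", r"0x[0-9a-fA-F]+"),   # '0' 'x' HEX_DIGIT+
    ("INTEGER_NUMBER", r"[0-9]+"),       # DIGIT+
    ("ID", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("COMMA", r","),
    ("SYMBOL", r"[{}=;]"),
    ("SPACE", r"[ \t\r\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_type, token_text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SPACE":   # same effect as {skip();} on SPACE
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("Option1 = 0, Option2 = 0x1F"))
# [('ID', 'Option1'), ('SYMBOL', '='), ('INTEGER_NUMBER', '0'), ('COMMA', ','),
#  ('ID', 'Option2'), ('SYMBOL', '='), ('HEX_NUMBER', '0x1F')]
```

HEX_NUMBER only matches when the 0 is followed by x and at least one hex digit. Otherwise the tokenizer falls through to INTEGER_NUMBER, which is the behaviour the lexer rules in the grammar give you.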
The following ANTLR grammar works for just the number part of the enum (edited to include Bart's advice):
grammar enum;
number :
integer_number | hex_number ;
hex_number
: HEX_NUMBER;
integer_number
: INT_NUMBER;
HEX_NUMBER
: HEX_INTRO HEX_DIGIT+;
INT_NUMBER
: DIGIT+;
fragment
HEX_INTRO
: '0x';
fragment
DIGIT : ('0'..'9');
fragment
HEX_DIGIT
: ('0'..'9'|'a'..'f'|'A'..'F') ;