
Antlr parsing numbers problem

I have a problem parsing integer & hex numbers. I want to parse C++ enums with the following rules:

grammar enum;

rule_enum
:   'enum' ID '{' enum_values+ '}'';';

enum_values
:   enum_value (COMMA enum_value)+;

enum_value
:   ID ('=' number)?;

number  :   hex_number | integer_number;

hex_number
:   '0' 'x' HEX_DIGIT+;

integer_number
:   DIGIT+;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
DIGIT   :   ('0'..'9');

COMMA   :   ',';


ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;

The problem I have is the following - when parsing code like:

enum Enum
{
    Option1 = 0,
    Option2 = 1
};

it does not recognize the 0 as integer_number but tries to parse it as hex_number. How can I resolve this?

Thank you. Tobias


First, fragment rules can only be "seen" by lexer rules, not parser rules. So, the following is invalid:

integer_number
:   DIGIT+; // can't use DIGIT here!

fragment
DIGIT   :   ('0'..'9');

To fix your ambiguity with these numbers, it's IMO best to make these integer- and hex numbers lexer rules instead of parser rules.

An example:

grammar enum;

rule_enum
:   'enum' ID '{' enum_values+ '}'';';

enum_values
:   enum_value (COMMA enum_value)*; // '*' so that an enum with a single value also parses

enum_value
:   ID ('=' number)?;

number
  :  HEX_NUMBER
  |  INTEGER_NUMBER
  ;

HEX_NUMBER
:   '0' 'x' HEX_DIGIT+;

INTEGER_NUMBER
:   DIGIT+;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
DIGIT   :   ('0'..'9');

COMMA   :   ',';

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;

SPACE : (' ' | '\t' | '\r' | '\n') {skip();};

which produces the following parse tree of your example snippet:

[Image: parse tree of the example enum snippet]
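
To illustrate why moving the numbers into lexer rules helps, here is a rough Python sketch of a longest-match tokenizer. It is not ANTLR and not generated code; the rule tables and the tokenize helper are hypothetical names made up for this illustration. It assumes (as I understand ANTLR's behaviour) that a literal such as '0' used inside a parser rule becomes a token of its own, and that fragment rules never produce tokens by themselves. Under those assumptions, the original grammar turns the input 0 into the literal token '0', which only hex_number can start with, while the revised grammar produces an INTEGER_NUMBER token.

```python
import re

# Tokens available for the question's grammar (assumption: the literals '0',
# 'x' and '=' from the parser rules become tokens; DIGIT and HEX_DIGIT are
# fragments, so they never show up as tokens on their own).
QUESTION_RULES = [
    ("'0'", r"0"),
    ("'x'", r"x"),
    ("'='", r"="),
    ("COMMA", r","),
    ("ID", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("SPACE", r"[ \t\r\n]"),
]

# Tokens for the revised grammar: whole numbers are matched by the lexer.
FIXED_RULES = [
    ("'='", r"="),
    ("HEX_NUMBER", r"0x[0-9a-fA-F]+"),
    ("INTEGER_NUMBER", r"[0-9]+"),
    ("COMMA", r","),
    ("ID", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("SPACE", r"[ \t\r\n]"),
]


def tokenize(text, rules):
    """Split text into (rule name, matched text) pairs.

    At each position the longest match wins; on a tie the rule listed
    first wins. SPACE tokens are dropped, like skip() in the grammar.
    """
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_match = None, None
        for name, pattern in compiled:
            m = pattern.match(text, pos)
            if m and (best_match is None or m.end() > best_match.end()):
                best_name, best_match = name, m
        if best_match is None:
            raise ValueError(f"no rule matches {text[pos]!r} at position {pos}")
        if best_name != "SPACE":
            tokens.append((best_name, best_match.group()))
        pos = best_match.end()
    return tokens


print(tokenize("Option1 = 0", QUESTION_RULES))   # last token is ("'0'", '0')
print(tokenize("Option1 = 0", FIXED_RULES))      # last token is ('INTEGER_NUMBER', '0')
print(tokenize("Option2 = 0x1F", FIXED_RULES))   # last token is ('HEX_NUMBER', '0x1F')
```

With the first rule table, an input such as Option2 = 1 cannot be tokenized at all, because no token-producing rule matches 1. With the second table, 0 and 0x1F no longer compete: the lexer picks whichever rule matches the most characters.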


The following ANTLR grammar works for just the number part of the enum (edited to include Bart's advice):

grammar enum;

number  :   
    integer_number | hex_number ;

hex_number
    :   HEX_NUMBER;

integer_number
    :   INT_NUMBER;


HEX_NUMBER
    :   HEX_INTRO HEX_DIGIT+;

INT_NUMBER
    :   DIGIT+;

fragment
HEX_INTRO
    :   '0x';


fragment
DIGIT   :   ('0'..'9');


fragment
HEX_DIGIT 
    : ('0'..'9'|'a'..'f'|'A'..'F') ;