
Making an AST node the lowest descendant of a recursive rule

I am trying to write a parser rule that allows zero or more occurrences of a token before a second rule, such that in the AST each successive token matched by the closure is a child of the previous token, and the result of the second rule is a child of the last token.

It is easier to explain by example:

expression11 : ((NOT | COMPLEMENT)^)* expression12;

For example, given the above parser rule, if I have the expression !!x (where x is an ID), I want, in my AST, the x to be the child of the second bang operator which is the child of the first.

Desired:

!
  \ child
    !
      \ child
       x

Instead of my desired behavior, the above line produces an AST for which the second bang operator is a child of the first, but the x is a child of the first bang operator, a sibling of the second one. Obviously not what I want for a unary operator.

Encountered behavior:

        !
child /   \ child
    x -sib- !

If I add a third operator (as in "!!!x") the third one becomes a child of the second, as expected, and x remains a child of the first, sibling of the second.

I thought perhaps I could fix this by surrounding the entire operator part with parentheses and adding another caret, such as

expression11 : (((NOT | COMPLEMENT)^)*)^ expression12;

in an effort to force expression12 to be a child of the entire closure of operators, hoping in vain that this would be interpreted as "The child of the entire closure means the child of the most-descended," but that was not the case and doing this did not change the behavior at all.

My question is "How do I get the parser to process the rule in such a way that the result of expression12 becomes the child of the most-descended 'NOT' or 'COMPLEMENT' node instead of the highest ancestor one?"

I would have thought this would be simple, but I cannot figure it out from the ANTLR resources on antlr.org nor by pleading with Google. Surely this is done all the time. Or is there a different way to structure the rule entirely that I am overlooking?

For completeness, here are the rules that follow expression11. They are not finished yet and will be modified, but they are complete enough for testing and behave as expected, since they are straightforward. 12 is for array length and method calls, 13 is for new classes and arrays, 14 is for array indexing, and 15 is for terminals and parentheses.

expression12 : expression13 (DOT (LENGTH | (ID LPAREN (expression (COMMA expression)*)? RPAREN)))?;
expression13 : expression14 | (NEW^ ((ID LPAREN RPAREN) | (INTTYPE LSQBRACK expression RSQBRACK)));
expression14 : expression15 (LSQBRACK expression RSQBRACK)*;
expression15 : (LPAREN expression RPAREN) | INTLIT | TRUE | FALSE | ID | THIS;

Thank you to anyone who can provide assistance; your time is much appreciated.


You must not use the Kleene star if you don't want operators to appear as siblings. Try a right-recursive rule instead, something like this (untested):

expression11 : (NOT | COMPLEMENT)^ expression11
             | expression12;
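
To see why the right-recursive form nests the way you want, here is a minimal hand-written sketch in Java. It is not ANTLR-generated code: the class `UnaryNesting`, its `Node` type, and the single-identifier `operand()` standing in for expression12 are all hypothetical, and `!` and `~` are only assumed spellings for NOT and COMPLEMENT. Each operator node is created first, and the result of the recursive call is attached as its only child, so the operand ends up under the innermost operator.

```java
import java.util.ArrayList;
import java.util.List;

public class UnaryNesting {
    // Minimal AST node: token text plus ordered children (hypothetical type).
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) { this.text = text; }

        // LISP-style rendering of the subtree rooted at this node.
        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c.toStringTree());
            return sb.append(')').toString();
        }
    }

    private final String input;
    private int pos = 0;

    UnaryNesting(String input) { this.input = input; }

    // Mirrors: expression11 : (NOT | COMPLEMENT)^ expression11 | expression12 ;
    // Assumes '!' is NOT and '~' is COMPLEMENT.
    Node expression11() {
        if (pos < input.length()
                && (input.charAt(pos) == '!' || input.charAt(pos) == '~')) {
            Node op = new Node(String.valueOf(input.charAt(pos++)));
            // The recursive result becomes the single child of this operator.
            op.children.add(expression11());
            return op;
        }
        return operand();
    }

    // Simplified stand-in for expression12: one alphanumeric identifier.
    Node operand() {
        int start = pos;
        while (pos < input.length()
                && Character.isLetterOrDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected identifier at " + start);
        }
        return new Node(input.substring(start, pos));
    }

    // Parses the whole string and returns the tree in LISP-style notation.
    static String parse(String s) {
        UnaryNesting p = new UnaryNesting(s);
        Node root = p.expression11();
        if (p.pos != s.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return root.toStringTree();
    }

    public static void main(String[] args) {
        System.out.println(parse("!!x")); // prints (! (! x))
    }
}
```

A loop-based version would have to remember the most recently created operator in order to attach the operand to it; the recursion gets that for free, because each call returns its own subtree to the operator that invoked it.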