Using C++ types in an ANTLR-generated C parser
I'm trying to use an ANTLR v3.2-generated parser in a C++ project using C as the output language. The generated parser can, in theory, be compiled as C++, but I'm having trouble dealing with C++ types inside parser actions. Here's a C++ header file defining a few types I'd like to use in the parser:
/* expr.h */
#include <string>

enum Kind {
    PLUS,
    MINUS
};
class Expr { // stub
};
class ExprFactory {
public:
    Expr mkExpr(Kind kind, Expr op1, Expr op2);
    Expr mkInt(std::string n);
};
And here's a simple parser definition:
/* Expr.g */
grammar Expr;
options {
language = 'C';
}
@parser::includes {
#include "expr.h"
}
@members {
ExprFactory *exprFactory;
}
start returns [Expr expr]
: e = expression EOF { $expr = e; }
;
expression returns [Expr e]
: TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
{ e = exprFactory->mkExpr(k,op1,op2); }
| INTEGER { e = exprFactory->mkInt((char*)$INTEGER.text->chars); }
;
builtinOp returns [Kind kind]
: TOK_PLUS { kind = PLUS; }
| TOK_MINUS { kind = MINUS; }
;
TOK_PLUS : '+';
TOK_MINUS : '-';
TOK_LPAREN : '(';
TOK_RPAREN : ')';
INTEGER : ('0'..'9')+;
The grammar runs through ANTLR just fine. When I try to compile ExprParser.c, I get errors like
conversion from ‘long int’ to non-scalar type ‘Expr’ requested
no match for ‘operator=’ in ‘e = 0l’
invalid conversion from ‘long int’ to ‘Kind’
In each case, the statement is an initialization of an Expr or Kind value to NULL.
I can make the problem go away for the Exprs by changing everything to Expr*. This is workable, though hardly ideal. But passing around pointers for a simple enum like Kind seems ridiculous. One ugly workaround I've found is to create a second return value, which pushes the Kind value into a struct and suppresses the initialization to NULL. I.e., builtinOp becomes
builtinOp returns [Kind kind, bool dummy]
: TOK_PLUS { $kind = PLUS; }
| TOK_MINUS { $kind = MINUS; }
;
and the first expression alternative becomes
TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
{ e = exprFactory->mkExpr(k.kind,*op1,*op2); }
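To make the effect of the dummy trick concrete, here is a rough C++ sketch of the shape involved. The names BuiltinOpReturn and builtinOp(char) are hypothetical stand-ins, not the identifiers ANTLR actually emits, and the struct layout is an assumption. The only point is that the Kind value lives in a struct field (hence k.kind in the action above), so no "Kind k = NULL" statement is needed.

```cpp
enum Kind { PLUS, MINUS };

// Hypothetical stand-in for the struct a multi-value rule returns.
// The real generated struct has a different name and likely more fields.
struct BuiltinOpReturn {
    Kind kind;
    bool dummy;  // exists only to force the struct-style return
};

// Hypothetical stand-in for the generated rule function; a char plays
// the role of the matched token.
BuiltinOpReturn builtinOp(char token) {
    BuiltinOpReturn retval;   // declared as a struct, so no "= NULL" on a Kind
    retval.kind = PLUS;       // this sketch sets both fields itself so the
    retval.dummy = false;     // result is always well defined
    if (token == '-') {
        retval.kind = MINUS;  // corresponds to { $kind = MINUS; }
    }
    return retval;
}
```

A caller then reads the value as builtinOp('-').kind, mirroring the k.kind access in the rewritten alternative.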
There has to be a better way to do this. Am I missing a configuration option to the C language backend? Is there another way to arrange my grammar to avoid this awkwardness? Is there a pure C++ backend I can use?
Here are the solutions I have found to this problem. The crux of the issue is that ANTLR wants to initialize all return values and attributes. For non-primitive types, ANTLR just assumes it can initialize with NULL. So, for example, the expression rule above will be translated into something like
static Expr
expression(pExprParser ctx)
{
    Expr e = NULL; // Declare and init return value
    Kind k; // declare attributes
    Expr op1, op2;
    k = NULL; // init attributes
    op1 = NULL;
    op2 = NULL;
    ...
}
The choices, as I see them, are these:
1. Give the values primitive types that can legally be initialized to NULL. E.g., use Expr* and Kind* instead of Expr and Kind.
2. Use the "dummy" trick, as above, to push the value into a structure where it won't be initialized.
3. Use reference parameters instead of return values. E.g.,
   builtinOp[Kind& kind] : TOK_PLUS { kind = PLUS; } | TOK_MINUS { kind = MINUS; } ;
4. Augment the classes used as value types with operations that make the above declarations and initializations legal. I.e., for an Expr return value, you need a constructor that can take NULL:
   Expr(long int n);
   For an Expr attribute, you need a no-arg constructor and an operator= that can take NULL:
   Expr();
   Expr operator=(long int n);
I know it is pretty hacky, but I'm going with #4 for the time being. It just so happens that my Expr class has a fairly natural definition of these operations.
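For anyone wanting to try #4, here is a minimal sketch of what such an Expr might look like. The "null" state, the string payload, and the function simulatedExpressionRule are all assumptions made for illustration; they are not my real class or anything ANTLR generates. The sketch writes 0L where the generated code writes NULL, because the compiler errors quoted in the question show NULL arriving as a long int zero. It also returns Expr& from operator=, which is the more conventional signature.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the real Expr class.
class Expr {
public:
    // Needed for attribute declarations such as "Expr op1, op2;"
    Expr() : text_(), isNull_(true) {}

    // Needed for "Expr e = NULL;" (NULL is seen as a long int zero)
    Expr(long int n) : text_(), isNull_(true) {
        assert(n == 0);
        (void)n;
    }

    // An ordinary constructor, explicit so that it never competes with
    // the NULL-accepting overload during implicit conversions
    explicit Expr(const std::string& text) : text_(text), isNull_(false) {}

    // Needed for "op1 = NULL;"
    Expr& operator=(long int n) {
        assert(n == 0);
        (void)n;
        text_.clear();
        isNull_ = true;
        return *this;
    }

    bool isNull() const { return isNull_; }
    const std::string& text() const { return text_; }

private:
    std::string text_;
    bool isNull_;
};

// Mirrors the statement pattern of the generated rule function shown
// earlier, with 0L standing in for NULL.
Expr simulatedExpressionRule(const std::string& digits) {
    Expr e = 0L;        // "declare and init return value"
    Expr op1;           // "declare attributes"
    op1 = 0L;           // "init attributes"
    assert(e.isNull() && op1.isNull());
    e = Expr(digits);   // what a rule action would assign
    return e;
}
```

Note that this only helps for class types. An enum like Kind cannot be given constructors this way, so it still needs one of the other options.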
P.S. On the ANTLR list, the maintainer of the C backend hints that this problem may be solved in future releases.