Why does ANTLR think there is a missing bracket?
I've created a grammar to parse simple LDAP query syntax. The grammar is:
expression : LEFT_PAREN! ('&' | '||' | '!')^ (atom | expression)* RIGHT_PAREN! EOF ;
atom : LEFT_PAREN! left '='^ right RIGHT_PAREN! ;
left : ITEM;
right : ITEM;
ITEM : ALPHANUMERIC+;
LEFT_PAREN : '(';
RIGHT_PAREN : ')';
fragment ALPHANUMERIC
: ('a'..'z' | 'A'..'Z' | '0'..'9');
WHITESPACE : (' ' | '\t' | '\r' | '\n') { skip(); };
Now this grammar works fine for:
(!(attr=hello2))
(&(attr=hello2)(attr2=12))
(||(attr=hello2)(attr2=12))
However, when I try and run:
(||(attr=hello2)(!(attr2=12)))
It fails with: line 1:29 extraneous input ')' expecting EOF
If I remove the EOF from the expression rule, everything passes, but then a wrong number of brackets is no longer caught as a syntax error. (The input is being parsed into a tree, hence the ^ and ! after tokens.) What have I missed?
As already mentioned by others, your expression rule has to end with an EOF, but a nested expression cannot end with an EOF, of course.
Remove the EOF from expression, and create a proper "entry point" rule for your parser that ends with the EOF.
file: T.g
grammar T;
options {
output=AST;
}
parse
: expression EOF!
;
expression
: '('! ('&' | '||' | '!')^ (atom | expression)* ')'!
;
atom
: '('! ITEM '='^ ITEM ')'!
;
ITEM
: ALPHANUMERIC+
;
fragment ALPHANUMERIC
: ('a'..'z' | 'A'..'Z' | '0'..'9')
;
WHITESPACE
: (' ' | '\t' | '\r' | '\n') { skip(); }
;
file: Main.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        String source = "(||(attr=hello2)(!(attr2=12)))";
        TLexer lexer = new TLexer(new ANTLRStringStream(source));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        CommonTree tree = (CommonTree)parser.parse().getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}
To run the demo, do:
*nix/MacOS:
java -cp antlr-3.3.jar org.antlr.Tool T.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main
Windows:
java -cp antlr-3.3.jar org.antlr.Tool T.g
javac -cp antlr-3.3.jar *.java
java -cp .;antlr-3.3.jar Main
which produces the DOT code representing the AST (image not reproduced here; it was created using graphviz-dev.appspot.com).
In your definition of expression, there can be parentheses containing a nested expression, but the nested expression has to end in EOF. In your sample input, the nested expression doesn't end in EOF.
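To make the point concrete without ANTLR, here is a minimal hand-written recursive-descent recognizer in plain Java. It is only a sketch of the same idea, not what ANTLR generates: the class name LdapSketch and all of its methods are hypothetical, and it only accepts or rejects input rather than building a tree. The end-of-input check lives solely in parse(), so expression() can safely call itself for nested input.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written sketch, not ANTLR output; every name here is hypothetical.
final class LdapSketch {
    private final List<String> tokens;
    private int pos = 0;

    private LdapSketch(List<String> tokens) { this.tokens = tokens; }

    private static boolean isAlnum(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // Splits the input into "(", ")", "=", operator and ITEM tokens, skipping whitespace.
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                i++;
            } else if (src.startsWith("||", i)) {
                out.add("||");
                i += 2;
            } else if ("()=&!".indexOf(c) >= 0) {
                out.add(String.valueOf(c));
                i++;
            } else if (isAlnum(c)) {
                int start = i;
                while (i < src.length() && isAlnum(src.charAt(i))) i++;
                out.add(src.substring(start, i));
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return out;
    }

    // Returns the token `offset` positions ahead, or a marker once input is exhausted.
    private String peekAt(int offset) {
        int i = pos + offset;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    private void expect(String t) {
        if (!peekAt(0).equals(t)) {
            throw new IllegalStateException("expected " + t + " but found " + peekAt(0));
        }
        pos++;
    }

    private static boolean isOperator(String t) {
        return t.equals("&") || t.equals("||") || t.equals("!");
    }

    private void item() {
        String t = peekAt(0);
        if (t.isEmpty() || !isAlnum(t.charAt(0))) {
            throw new IllegalStateException("expected ITEM but found " + t);
        }
        pos++;
    }

    // Entry point: the only rule that requires the input to be fully consumed.
    private void parse() {
        expression();
        if (pos != tokens.size()) {
            throw new IllegalStateException("extraneous input " + peekAt(0) + " expecting EOF");
        }
    }

    // expression : '(' op (atom | expression)* ')'   (no EOF here, so nesting works)
    private void expression() {
        expect("(");
        if (!isOperator(peekAt(0))) {
            throw new IllegalStateException("expected operator but found " + peekAt(0));
        }
        pos++;
        while (peekAt(0).equals("(")) {
            // Two tokens of lookahead decide between a nested expression and an atom.
            if (isOperator(peekAt(1))) expression(); else atom();
        }
        expect(")");
    }

    // atom : '(' ITEM '=' ITEM ')'
    private void atom() {
        expect("(");
        item();
        expect("=");
        item();
        expect(")");
    }

    static boolean accepts(String src) {
        try {
            new LdapSketch(tokenize(src)).parse();
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("(||(attr=hello2)(!(attr2=12)))")); // nested, balanced
        System.out.println(accepts("(||(attr=hello2)(!(attr2=12))"));  // one ')' short
    }
}
```

If the end-of-input check were moved into expression() itself, the nested call would fail as soon as it saw the outer closing bracket, which is the same situation as the "extraneous input ')' expecting EOF" error in the question.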