String literal token generates MismatchedTokenException with escape sequence token
I am currently trying to implement an Antlr parser.
I get a strange MismatchedTokenException in a token that identifies string literals once I add escape sequence support. The following ANTLR grammar causes the issue:
rule: STRING_LITERAL ;
STRING_LITERAL
:
'"' STRING_GUTS '"'
;
fragment
STRING_GUTS
:
( ESC | ~('\\' | '"') )*
;
ESC
:
'\\'
( '\\' | '"' )
;
Do you see any issue in this code?
Note that if I remove ESC from the STRING_GUTS, the string parsing is working well...
You'll have to post the input you're getting this error with, the ANTLR version you're using, and the way you're running your test(s), because I see no problem with that grammar, as you can see:
T.g
grammar T;
rule
: STRING_LITERAL {System.out.println("parsed : " + $STRING_LITERAL.text);}
;
STRING_LITERAL
: '"' STRING_GUTS '"'
;
fragment
STRING_GUTS
: (ESC | ~('\\' | '"'))*
;
// also a fragment rule perhaps?
ESC
: '\\' ('\\' | '"')
;
Main.java
import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) throws Exception {
        String src = "\"a\\\"b\\\\c\"";
        TLexer lexer = new TLexer(new ANTLRStringStream(src));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        System.out.println("src : " + src);
        parser.rule();
    }
}
If I generate a lexer and parser from your grammar (1), compile all Java source files (2) and run the Main class (3):
java -cp antlr-3.3.jar org.antlr.Tool T.g # 1
javac -cp antlr-3.3.jar *.java # 2
java -cp .;antlr-3.3.jar Main # 3 (Windows; on Linux/macOS use ':' instead of ';')
The following is printed to the console:
src : "a\"b\\c"
parsed : "a\"b\\c"
I.e., the input src is parsed as expected.
If you're encountering problems with ANTLRWorks' interpreter, don't use it: it's a bit buggy. Use ANTLRWorks' debugger instead, or a custom test class as I did above.
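As a rough cross-check outside ANTLR, the STRING_LITERAL rule can be approximated with a java.util.regex pattern. This is only a sketch: the class name StringLiteralCheck and the method isStringLiteral are made up for illustration and are not part of the grammar or of the ANTLR runtime. It shows that the rule accepts only \\ and \" as escape sequences, so input containing any other escape (for example a backslash followed by n) would not lex as a STRING_LITERAL, which may be worth checking in the input that triggers the exception.

```java
import java.util.regex.Pattern;

public class StringLiteralCheck {
    // Regex approximation (an assumption, not generated by ANTLR) of:
    //   '"' ( '\\' ('\\' | '"') | ~('\\' | '"') )* '"'
    static final Pattern STRING_LITERAL =
        Pattern.compile("\"(?:\\\\[\\\\\"]|[^\\\\\"])*\"");

    // Returns true if the whole input is one string literal under the rule above.
    static boolean isStringLiteral(String s) {
        return STRING_LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Same input as in Main.java: "a\"b\\c"
        System.out.println(isStringLiteral("\"a\\\"b\\\\c\"")); // prints true
        // Backslash followed by 'n' is not covered by ESC: "a\nb"
        System.out.println(isStringLiteral("\"a\\nb\""));       // prints false
    }
}
```

If other escapes such as \n or \t are needed, the ESC rule in the grammar would have to list them explicitly.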