Parsing ambiguous input with ANTLR

I have been trying for a few days to parse input that consists of words and numbers (I've called it a sentence in my grammar):

    sentence
    options {
        greedy=false;
    }
        : (ANY_WORD | INT)+;

I have a rule that needs to parse a sentence that finishes with an INT:

    sentence_with_int 
        : sentence INT;

So if I had the input "the number of size 14 shoes bought was 3", then sentence_with_int should be matched, not just sentence. I'm sure there is a better way to do this, but I'm just learning the tool.

Thanks, Richard


Your grammar:


    grammar Test;

    sentence_with_int 
      :  sentence {System.out.println("Parsed: sentence='"+$sentence.text+"'");}
         INT      {System.out.println("Parsed: int='"+$INT.text+"'");}
      ;

    sentence
      : (ANY_WORD | INT)+
      ;

    ANY_WORD
      :  ('a'..'z' | 'A'..'Z')+
      ;

    INT
      :  ('0'..'9')+
      ;

    WS  
      :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
      ;

does exactly that. Here's a little test harness:

    import org.antlr.runtime.*;

    public class Demo {
        public static void main(String[] args) throws Exception {
            ANTLRStringStream in = new ANTLRStringStream("the number of size 14 shoes bought was 3");
            TestLexer lexer = new TestLexer(in);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            TestParser parser = new TestParser(tokens);
            parser.sentence_with_int();
        }
    }

First generate a parser & lexer (assuming all your files, and the ANTLR jar, are in the same directory):

    java -cp antlr-3.2.jar org.antlr.Tool Test.g

and compile all .java source files:

    javac -cp antlr-3.2.jar *.java

and finally run the Demo class:

    java -cp .:antlr-3.2.jar Demo

(on Windows, replace the : with a ;)

which produces the following output:

    Parsed: sentence='the number of size 14 shoes bought was'
    Parsed: int='3'
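
One way to picture why the `14` ends up inside the sentence while the `3` does not: when the `(ANY_WORD | INT)+` loop sees an INT, it can only stop there if nothing follows it. The hand-written Java below is a rough sketch of that idea, not what ANTLR actually generates. The class name `LookaheadSketch` and its methods are made up for illustration, and the regex tokenizer is a simplification that silently skips anything that is not a letter or a digit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: mimics the grammar's behaviour by hand.
public class LookaheadSketch {
    // Roughly mirrors the lexer rules: ANY_WORD is letters, INT is digits.
    // Whitespace (and, unlike a real lexer, any other character) is skipped.
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z]+|[0-9]+");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static boolean isInt(String token) {
        return !token.isEmpty() && Character.isDigit(token.charAt(0));
    }

    // Returns {sentence, int}, or null if the input is not a sentence followed by an INT.
    public static String[] parseSentenceWithInt(String input) {
        List<String> tokens = tokenize(input);
        List<String> sentence = new ArrayList<>();
        int i = 0;
        // (ANY_WORD | INT)+ : keep consuming, unless the current token is an INT
        // with nothing after it. That INT is left for the enclosing rule.
        while (i < tokens.size()) {
            boolean lastToken = (i + 1 == tokens.size());
            if (isInt(tokens.get(i)) && lastToken) {
                break;
            }
            sentence.add(tokens.get(i));
            i++;
        }
        // The sentence needs at least one token, and exactly one INT must remain.
        if (sentence.isEmpty() || i != tokens.size() - 1 || !isInt(tokens.get(i))) {
            return null;
        }
        return new String[] { String.join(" ", sentence), tokens.get(i) };
    }

    public static void main(String[] args) {
        String[] result = parseSentenceWithInt("the number of size 14 shoes bought was 3");
        System.out.println("sentence='" + result[0] + "'");
        System.out.println("int='" + result[1] + "'");
    }
}
```

With this sketch, input that does not end in a number (for example "shoes bought") or that consists of a single number is rejected, since the sentence part must contain at least one token and one INT must remain after it.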