Writing a language converter in ANTLR

I'm writing a converter between dialects of the same programming language. I found a grammar online; it's complex and handles all the cases. Now I'm trying to write the appropriate actions.

Most of the input will simply be copied to the output. What I need to do is parse function calls, transform them (rename the function, reorder arguments, etc.) and write out the result.

I'm using an AST as the output. When I come across a function call, I build a custom object structure (from classes defined in my target language) and call the appropriate function, which gives me a string representing the transformed function call.

The problem is: what am I supposed to do with that string? I'd like to replace the .text attribute of the enclosing rule, but setText() is only available in lexer rules, and a parser rule's .text attribute is read-only. How can I solve this?

program
    : statement_list            { output = $statement_list.text; }
    ;

//...

statement
    :   expression_statement
    // ...
    ;

expression_statement
    : function_call
    // ...
    ;

function_call
    : ID '('                    { /* build the object, assign name */
                                  Function function = new Function();
                                  //...
                                }
      (
      arg1 = expression         { /* add first parameter */ }
      ( ',' arg2 = expression   { /* add the rest of parameters */ }
      )*
      )?
      ')'                       { /* convert the function call */
                                  string converted = Tools.Convert(function);
                                  // $setText(converted);               // doesn't work
                                  // $function_call.text = converted;   // doesn't work
                                }
    ;


Once you have an AST, you'll need to write a tree walker that emits your program as the transformed source. You might even have an intermediate tree walker doing tree transformations depending upon the complexity of your changes.
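
To make the tree-walker idea concrete, here is a minimal sketch in plain Java (ANTLR's native runtime language; the same shape carries over to C#). It does not use ANTLR's tree classes: Node, Literal and Call are hypothetical stand-ins for whatever AST you build, and the substr-to-mid rule is an invented example of a dialect difference, not something taken from the question.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical AST node types; none of these names come from ANTLR.
abstract class Node {
    // Render this subtree as source text in the target dialect.
    abstract String emit();
}

class Literal extends Node {
    final String text;
    Literal(String text) { this.text = text; }
    String emit() { return text; }   // copied through unchanged
}

class Call extends Node {
    final String name;
    final List<Node> args;
    Call(String name, List<Node> args) { this.name = name; this.args = args; }

    String emit() {
        String outName = name;
        List<Node> outArgs = args;
        // Assumed dialect rule, for illustration only:
        // substr(s, start, len) becomes mid(start, len, s).
        if (name.equals("substr") && args.size() == 3) {
            outName = "mid";
            outArgs = Arrays.asList(args.get(1), args.get(2), args.get(0));
        }
        StringBuilder sb = new StringBuilder(outName).append('(');
        for (int i = 0; i < outArgs.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(outArgs.get(i).emit());   // recurse into each argument
        }
        return sb.append(')').toString();
    }
}

class EmitDemo {
    public static void main(String[] args) {
        Node inner = new Call("substr", Arrays.asList(
                new Literal("s"), new Literal("1"), new Literal("3")));
        Node outer = new Call("print", Arrays.asList(inner));
        System.out.println(outer.emit());   // prints: print(mid(1, 3, s))
    }
}
```

Note that an emitter like this has to reproduce every construct of the language, including the ones you do not change, and it loses the original whitespace and comments unless you carry them along in the tree.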

That said, going through an AST step may not be the best approach.

You might want to take a look at "Language Implementation Patterns" by Terence Parr (Pragmatic Bookshelf). Chapter 11 addresses your type of program.

He mentions a tool, ANTLRMorph, that might be better suited for your problem.


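The easiest way is to create a rewriter. Set the grammar options output = template and rewrite = true, then build the replacement template in place in the rule's rewrite (the part after ->). In the driver, feed the parser a TokenRewriteStream and call its ToString() method to get the rewritten text; everything you did not rewrite is copied through unchanged.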

grammar Test;

options {
    language = CSharp2;
    output = template;
    rewrite = true;
}

program
    : statement_list
    ;

//...

statement
    :   expression_statement
    // ...
    ;

expression_statement
    : function_call
    // ...
    ;

function_call
    : ID '('                    { /* build the object, assign name */
                                  Function function = new Function();
                                  //...
                                }
      (
      arg1 = expression         { /* add first parameter */ }
      ( ',' arg2 = expression   { /* add the rest of parameters */ }
      )*
      )?
      ')' -> { new StringTemplate(Tools.Convert(function)) }
    ;

And the driver:

    string input = "....";

    var stream = new ANTLRStringStream(input);
    var lexer = new TestLexer(stream);

    // need to use TokenRewriteStream
    var tokenStream = new TokenRewriteStream(lexer);
    var parser = new TestParser(tokenStream);

    parser.program();

    // original text
    Console.WriteLine(tokenStream.ToOriginalString());
    // rewritten text
    Console.WriteLine(tokenStream.ToString());
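
If it helps to see why untouched input survives unchanged, the following is a simplified model of the idea behind a rewriting token stream, written in plain Java. It is not ANTLR's TokenRewriteStream: RewriteBuffer and its methods are hypothetical names, and the sketch assumes edits never overlap. The original tokens are never modified; replacements are only recorded against token index ranges and applied when the text is rendered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; the real ANTLR class has a richer API.
class RewriteBuffer {
    // One recorded replacement: tokens first..last render as text.
    private static final class Edit {
        final int last;
        final String text;
        Edit(int last, String text) { this.last = last; this.text = text; }
    }

    private final List<String> tokens;                        // original token texts, never mutated
    private final Map<Integer, Edit> edits = new HashMap<>(); // keyed by first token index

    RewriteBuffer(List<String> tokens) { this.tokens = new ArrayList<>(tokens); }

    // Record a replacement for the inclusive token range first..last.
    // Assumption: ranges do not overlap.
    void replace(int first, int last, String text) {
        edits.put(first, new Edit(last, text));
    }

    String toOriginalString() {
        return String.join("", tokens);
    }

    String toRewrittenString() {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < tokens.size()) {
            Edit e = edits.get(i);
            if (e != null) {
                sb.append(e.text);       // emit the replacement once
                i = e.last + 1;          // skip the tokens it covers
            } else {
                sb.append(tokens.get(i));
                i++;
            }
        }
        return sb.toString();
    }
}

class RewriteDemo {
    public static void main(String[] args) {
        RewriteBuffer buf = new RewriteBuffer(Arrays.asList(
                "x", " ", "=", " ", "substr", "(", "s", ",", " ", "1", ")", ";"));
        buf.replace(4, 10, "mid(1, s)");               // the function call spans tokens 4..10
        System.out.println(buf.toOriginalString());    // prints: x = substr(s, 1);
        System.out.println(buf.toRewrittenString());   // prints: x = mid(1, s);
    }
}
```

In the grammar above, the rewrite on function_call plays the role of the replace call: the generated parser is expected to record the template's text against the token range matched by the rule, which is why only the function calls change in the final string.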