开发者

How to write the Visitor Pattern for Abstract Syntax Tree in Python?

My collegue suggested me to write a visitor pattern to navigate the AST. Can an开发者_StackOverflow中文版yone tell me more how would I start writing it?

As far as I understand, each Node in AST would have visit() method (?) that would somehow get called (from where?). That about concludes my understanding.

To simplify everything, suppose I have nodes Root, Expression, Number, Op and the tree looks like this:

       Root
        |
       Op(+)
      /   \
     /     \
 Number(5)  \
             Op(*)
             /   \
            /     \
           /       \
       Number(2)   Number(444)

Can anyone think of how the visitor pattern would visit this tree to produce output:

 5 + 2 * 444

Thanks, Boda Cydo.


Wikipedia has a great overview of how the Visitor pattern works, although the sample implementation that they use is in Java. You can easily port that to Python, though, no?

Basically, you want to implement a mechanism for double dispatch. Each node in your AST would need to implement an accept() method (NOT a visit() method). The method takes, as an argument, a visitor object. In the implementation of this accept() method, you call a visit() method of the visitor object (there will be one for each AST node type; in Java, you'll use parameter overloading, in Python I suppose you can use different visit_*() methods). The correct visitor will then be dispatched with the correct Node type as argument.


The two variants to implement the Visitor pattern in Python that I encountered on the Internet most often:

  • One-to-one translation of the example from the Design Patterns book by Gamma et al.
  • Using additional modules for double-dispatch

Translated Example from the Design Patterns Book

This variant uses accept() methods in the data structure classes and corresponding visit_Type() methods in the visitors.

The data structure

class Operation(object):
    def __init__(self, op, arg1, arg2):
        self.op = op
        self.arg1 = arg1
        self.arg2 = arg2
    def accept(self, visitor):
        visitor.visitOperation(self)

class Integer(object):
    def __init__(self, num):
        self.num = num
    def accept(self, visitor):
        visitor.visitInteger(self)

class Float(object):
    def __init__(self, num):
        self.num = num
    def accept(self, visitor):
        visitor.visitFloat(self)
    
expression = Operation('+', Integer('5'),
                            Operation('*', Integer('2'), Float('444.1')))

Infix Print Visitor

class InfixPrintVisitor(object):
    def __init__(self):
        self.expression_string = ''
    def visitOperation(self, operation):
        operation.arg1.accept(self)
        self.expression_string += ' ' + operation.op + ' '
        operation.arg2.accept(self)
    def visitInteger(self, number):
        self.expression_string += number.num
    def visitFloat(self, number):
        self.expression_string += number.num

Prefix Print Visitor

class PrefixPrintVisitor(object):
    def __init__(self):
        self.expression_string = ''
    def visitOperation(self, operation):
        self.expression_string  += operation.op + ' '
        operation.arg1.accept(self)
        self.expression_string  += ' '
        operation.arg2.accept(self)
    def visitInteger(self, number):
        self.expression_string += number.num
    def visitFloat(self, number):
        self.expression_string += number.num

Test

infixPrintVisitor = InfixPrintVisitor()
expression.accept(infixPrintVisitor)
print(infixPrintVisitor.expression_string)
prefixPrintVisitor = PrefixPrintVisitor()
expression.accept(prefixPrintVisitor)
print(prefixPrintVisitor.expression_string)

Output

5 + 2 * 444.1
+ 5 * 2 444.1

Using additional modules

This variant uses @functools.singledispatch() decorator (available in the Python Standard Library since Python v3.4).

The data structure

class Operation(object):
    def __init__(self, op, arg1, arg2):
        self.op = op
        self.arg1 = arg1
        self.arg2 = arg2
    
class Integer(object):
    def __init__(self, num):
        self.num = num

class Float(object):
    def __init__(self, num):
        self.num = num
    
expression = Operation('+', Integer('5'), 
                            Operation('*', Integer('2'), Float('444.1')))

Infix Print Visitor

from functools import singledispatch

@singledispatch
def visitor_print_infix(obj):
    pass
@visitor_print_infix.register(Operation)
def __(operation):
    return visitor_print_infix(operation.arg1) + ' ' \
               + operation.op + ' ' \
               + visitor_print_infix(operation.arg2)
@visitor_print_infix.register(Integer)
@visitor_print_infix.register(Float)
def __(number):
    return number.num

Prefix Print Visitor

from functools import singledispatch

@singledispatch
def visitor_print_prefix(obj):
    pass
@visitor_print_prefix.register(Operation)
def __(operation):
    return operation.op + ' ' \
               + visitor_print_prefix(operation.arg1) + ' ' \
               + visitor_print_prefix(operation.arg2)
@visitor_print_prefix.register(Integer)
@visitor_print_prefix.register(Float)
def __(number):
    return number.num

Test

print(visitor_print_infix(expression))
print(visitor_print_prefix(expression))

Output

5 + 2 * 444.1
+ 5 * 2 444.1

The reason I prefer this variant is that it eliminates the accept() methods and completely separates the data structure from the operations implemented in the visitors. Extending the data structure with new elements does not require changing the visitors. The visitors ignore the unknown element types by default (see the definitions with the pass keyword). A drawback of this method is that singledispatch decorator cannot be used with instance methods directly, although, there are ways to make it work. Note: Starting with Python 3.8 functools.singledispatchmethod() provides the same functionality as functools.singledispatch() but for instance methods.

For Python before v3.4 multimethods module can be used similar to the singledispatch decorator. One drawback of the multimethods module is that the visitor method that is applied to a given data-structure element is selected not only based on the element's type but also on the order in which the methods are declared. Keeping the method definitions in the right order can be cumbersome and error prone for data structures with a complex inheritance hierarchy.


See the docs for ast.NodeVisitor, e.g. a crude possibility might be:

import ast

class MyVisitor(ast.NodeVisitor):
  def visit_BinaryOp(self, node):
    self.visit(node.left)
    print node.op,
    self.visit(node.right)
  def visit_Num(self, node):
    print node.n,

of course this doesn't emit parentheses even where needed, etc, so there's actually more work done, but, it's a start;-).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜