Tree structure for AST

2023-02-07 16:07 问答作者：

I am trying to write an interpreter in C# and I am on a parsing stage. I figured out I have to generate Abstract Syntax Tree at this point, but I don't know how to represent it in C#.

Cur开发者_如何转开发rently I am just using List<object>, but I have a feeling that I am doing it wrong.

Thanks a lot.

There are numerous techniques, ranging from a single "node" type — a big bucket of fields with a "type" tag — to a fine-grained hierarchy of specific classes. The important thing is to think about how to make traversal code simple and robust, because you will need to traverse this data structure over and over again.

Taking a step back, however, interpretation doesn't strictly require an AST. Many earlier interpreters would quite literally read each line of code as they encountered it, parsing and executing it on the fly, with loops, etc., implemented via a stack-based book-marking system. I suspect shell languages like bash and cmd.exe still work like this today.

I suggest that you continue using List<object> until you understand clearly what limitations does this impose and what your requirements are. Or, since you are writing a LISP interpreter, create a pair class and use that with object, null being the equivalent of '():

public sealed class Pair
{
    public object Car ;
    public object Cdr ;
}

Parse your input directly into an S-expression and have your interpreter work directly with that. After all, LISP existed before ASTs!

Most AST node implementations are pretty simple.

They are a struct (ok, ok, "class") containing a node type (usually an integer), a list of children (List is OK; high performance implementations have a set of members for statistically common 1st, 2nd, 3rd children), and some additional fields to carry values specific to the AST node instance, (e.g., the value "5" for the AST node "integer constant"). To enable efficient navigation of the tree from any node back to parents, there is often a special reference back to the parent node.

What's harder is deciding what set of AST nodes you should have. For a large grammar, this is inconvenient as you'll have to define a set of several hundred of them, and there will be churn as you modify the grammar in your attempts to get it right.

A simple trick is simply define one AST node per grammar rule. (This would be called a "concrete syntax tree" by most). But it is brainless and you don't miss anything.

Our DMS Software Reengineering Toolkit follows this "simple trick" idea, generating the AST node types dirrectly from the grammar rules. It additionally optimizes: leaf AST nodes that don't carry values aren't present in the tree, nodes for unary productions aren't present in the tree, and list-forming productions have the List style of children, while other node types have the fixed slot types for children. The net result is you what is pretty close to a "abstract" syntax tree anyway. All of this is automatically constructed by DMS's parser generator so you don't have think at all.

DMS also has a full, well-test C# 4.0 front end. Once you get past the headache of defining the AST, you'll then want to analyze/transform/generate from it, and the rest of DMS will suddenly become obviously valuable.

继续阅读：.net abstract-syntax-tree tree

Tree structure for AST

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？