开发者

What is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner?

Given a LL(1) grammar what is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner? Please feel free to write example code in whatever language you prefer.

My Idea

symbol : either a token or a node

result : success or failure

token : a lexical token from source text
    value -> string : the value of the token
    type -> integer : the named type code of the token
    next -> token : reads the next开发者_运维问答 token and keeps position of the previous token
    back -> token : moves back to the previous position and re-reads the token

node : a node in the syntax tree 
    type -> integer : the named type code of the node
    symbols -> linkedList : the symbols found at this node
    append -> symbol -> node : appends the new symbol  to a new copy of the node

Here is an idea I have been thinking about. The main issue here is handling syntax errors. I mean I could stop at the first error but that doesn't seem right.

let program token =
    sourceElements (node nodeType.program) token        

let sourceElements node token =
    let (n, r) = sourceElement (node.append (node nodeType.sourceElements)) token
    match r with
    | success -> (n, r) 
    | failure -> // ???     

let sourceElement node token =
    match token.value with
    | "function" -> 
        functionDeclaration (node.append (node nodeType.sourceElement)) token
    | _ -> 
        statement (node.append (node nodeType.sourceElement)) token 

Please Note

I will be offering up a nice bounty to the best answer so don't feel rushed. Answers that simply post a link will have less weight over answers that show code or contain detailed explanations.

Final Note

I am really new to this kind of stuff so don't be afraid to call me a dimwit.


You want to parse something into an abstract syntax tree.

In the purely functional programming language Haskell, you can use parser combinators to express your grammar. Here an example that parses a tiny expression language:

EDIT Use monadic style to match Graham Hutton's book

    -- import a library of *parser combinators*
import Parsimony
import Parsimony.Char
import Parsimony.Error
(+++) = (<|>)

    -- abstract syntax tree
data Expr = I Int
          | Add Expr Expr
          | Mul Expr Expr
          deriving (Eq,Show)

    -- parse an expression
parseExpr :: String -> Either ParseError Expr
parseExpr = Parsimony.parse pExpr
    where

        -- grammar
    pExpr :: Parser String Expr
    pExpr = do
        a <- pNumber +++ parentheses pExpr  -- first argument
        do
            f <- pOp                        -- operation symbol
            b <- pExpr                      -- second argument
            return (f a b)
         +++ return a                       -- or just the first argument

    parentheses parser = do                 -- parse inside parentheses
        string "("
        x <- parser
        string ")"
        return x

    pNumber = do                            -- parse an integer
        digits <- many1 digit
        return . I . read $ digits

    pOp =                                   -- parse an operation symbol
         do string "+"
            return Add
        +++ 
         do string "*"
            return Mul

Here an example run:

*Main> parseExpr "(5*3)+1"
Right (Add (Mul (I 5) (I 3)) (I 1))

To learn more about parser combinators, see for example chapter 8 of Graham Hutton's book "Programming in Haskell" or chapter 16 of "Real World Haskell".

Many parser combinator library can be used with different token types, as you intend to do. Token streams are usually represented as lists of tokens [Token].


Definitely check out the monadic parser combinator approach; I've blogged about it in C# and in F#.


Eric Lippert's blog series on immutable binary trees may be helpful. Obviously, you need a tree which is not binary, but it will give you the general idea.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜