Scala parser combinators: how to parse "if(x)" if x can contain a ")"
I'm trying to get this to work:
def emptyCond: Parser[Cond] = ("if" ~ "(") ~> regularStr <~ ")" ^^ { case s => Cond("",Nil,Nil) }
where regularStr is defined to accept a number of things, including ")". Of course, I want this to be an acceptable input: if(foo()). But for any if(x) it is taking the ")" as part of the regularStr and so this parser never succeeds.
What am I missing?
Edit:
regularStr is not a regular expression. It is defined thus:
def regularStr = rep(ident | numericLit | decimalLit | stringLit | stmtSymbol) ^^ { case s => s.mkString(" ") }
and the symbols are:
val stmtSymbol = "*" | "&" | "." | "::" | "(" | ")" | "*" | ">=" | "<=" | "=" |
"<" | ">" | "|" | "-" | "," | "^" | "[" | "]" | "?" | ":" | "+" |
"-=" | "+=" | "*=" | "/=" | "&&" | "||" | "&=" | "|="
I don't need an exhaustive language check, just the control structures. So I don't really care what's inside "()" in if(); I want to accept any sequence of identifiers, symbols, etc. So, for my purposes even if())) should be valid, where "))" is the if's "condition".
A regular expression cannot recognize a language that has nested, balanced constructs such as (...), [...], {...}, etc. So you're going to need to use further context-free productions (not regular expressions) to match the regularStr portions.
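To make the idea of a context-free production concrete, here is a minimal sketch that works on raw characters with RegexParsers rather than on the question's token-based parsers. The names Balanced, plain, group, body and ifCond are made up for illustration, and it assumes the scala-parser-combinators module is on the classpath (it has been a separate library since Scala 2.11).

```scala
import scala.util.parsing.combinator.RegexParsers

object Balanced extends RegexParsers {
  // plain ::= one or more characters that are not parentheses
  def plain: Parser[String] = """[^()]+""".r

  // group ::= "(" body ")"  -- refers back to body, which is what makes
  // the grammar context-free rather than regular
  def group: Parser[String] = "(" ~> body <~ ")" ^^ { inner => "(" + inner + ")" }

  // body ::= (plain | group)*  -- never consumes an unmatched ")"
  def body: Parser[String] = rep(plain | group) ^^ (_.mkString)

  // The outer ")" is left for this rule because body stops in front of it.
  def ifCond: Parser[String] = "if" ~> "(" ~> body <~ ")"
}
```

With these definitions, Balanced.parseAll(Balanced.ifCond, "if(foo())") should succeed, while "if(x))" should be rejected because the trailing ")" has no matching "(".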
OK, accepting if())) was not really a requirement, just an example of what I would be willing to accept in order to make my parsing as cheap as possible, to just worry about capturing control structures.
However, it appears I can't be so cheap and still have it work. So, since the if() construct has parentheses, all I have to do is expect what's inside to have well-balanced parentheses. A closing ")" where one isn't expected cannot be part of the condition.
I did this:
val regularNoParens = ident | numericLit | decimalLit | stringLit | stmtSymbol
def regularParens: Parser[String] = "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ { case l ~ s ~ r => l + s.mkString(" ") + r }
def regularStr = rep(regularNoParens | regularParens) ^^ { case s => s.mkString(" ") }
And I took out "(" and ")" from stmtSymbol. Works!
Edit: it didn't support nesting, fixed it.
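For anyone who wants to try this, here is a self-contained sketch of the same approach. Several things in it are assumptions rather than my real code: it extends StandardTokenParsers, Cond is a stand-in case class that just keeps the condition text, the symbol list is shortened, decimalLit is left out because its definition isn't shown above, and IfParser and parseIf are names made up for the example. It also needs the scala-parser-combinators library on the classpath.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical stand-in for Cond; it only stores the condition text.
case class Cond(text: String)

object IfParser extends StandardTokenParsers {
  // The lexer must know every keyword and symbol used as a string literal below.
  lexical.reserved += "if"
  lexical.delimiters ++= List("(", ")", "*", "&", ".", ",", "<", ">", "=", "&&", "||")

  // Shortened symbol set for illustration; "(" and ")" are deliberately absent.
  def stmtSymbol: Parser[String] =
    "*" | "&" | "." | "," | "<" | ">" | "=" | "&&" | "||"

  def regularNoParens: Parser[String] = ident | numericLit | stringLit | stmtSymbol

  // A parenthesised group may contain further groups, so nesting is supported.
  def regularParens: Parser[String] =
    "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ {
      case l ~ s ~ r => l + s.mkString(" ") + r
    }

  // Stops in front of an unmatched ")", leaving it for emptyCond to consume.
  def regularStr: Parser[String] =
    rep(regularNoParens | regularParens) ^^ (_.mkString(" "))

  def emptyCond: Parser[Cond] =
    ("if" ~ "(") ~> regularStr <~ ")" ^^ { s => Cond(s) }

  // phrase() requires the whole input to be consumed.
  def parseIf(input: String): ParseResult[Cond] =
    phrase(emptyCond)(new lexical.Scanner(input))
}
```

IfParser.parseIf("if(foo())") should succeed with the condition text "foo ()" (tokens are rejoined with spaces), and IfParser.parseIf("if(x))") should fail on the extra ")".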