Language Parser with conditions .NET

2023-03-17 02:51

I am looking for a language parser that can resolve something like the following:

(x=7 OR y=1) AND (x>0)

I was thinking of using the ANTLR parser generator. Is there a simpler language parser available in the newer versions of the .NET Framework (.NET 3.5, .NET 4.0)?

If the problem is just an expression involving monadic and binary operators, parentheses, variable values and constant operands, you can write a recursive descent parser/evaluator that will both parse and evaluate the expression at the same time. No need to build trees or IL or...

[If you need to code more complex syntax such as statements and methods, you'll need a more complex parser, and then a parser generator pays off]

For just an expression, you can code the recursive descent parser directly from a BNF for your expression. Make sure you first left-factor the commonality for each rule, e.g., not

 SUM = TERM '+' TERM ;
 SUM = TERM '-' TERM ;

but

 SUM = TERM ( '+' TERM |  '-' TERM ) ;

For each left-hand-side nonterminal, create a (possibly recursive) subroutine that uses an index into the string being parsed and either returns the value of the expression (assume float) or throws a syntax error. For each right-hand-side token, code a test: if the token isn't present, pass control to an alternative, or throw a syntax error if there are no other alternatives. If the token is present, evaluate it: for a terminal value (e.g., a number or variable), get the value and advance the input; for a nonterminal, call its subroutine to see whether that syntax is present; for an operator, simply skip it and advance the input. If your tests match all the elements of a right-hand side, compute the expression result and return it.

So for the grammar:

EXP = SUM ;
SUM = TERM ( '+' TERM |  '-' TERM )* ;
TERM = PRIMARY ( '*' PRIMARY | '/' PRIMARY )* ;
PRIMARY = '-' PRIMARY | '(' EXP ')' | NUMBER | VARIABLE ;

And given a buffer of characters INPUT containing the expression, and a global variable I that is an index into INPUT, the code is roughly:

float EXP()
{  return SUM();
}

float SUM()
{  float t=TERM();
   while true   // the operator part is optional and may repeat
   {  if MATCH("+") t=t+TERM();
      else if MATCH("-") t=t-TERM();
      else return t;
   }
}

float TERM()
{  float t=PRIMARY();
   while true   // the operator part is optional and may repeat
   {  if MATCH("*") t=t*PRIMARY();
      else if MATCH("/") t=t/PRIMARY();
      else return t;
   }
}

float PRIMARY()
{  float t;
   if MATCH("-") return -PRIMARY();
   if MATCH("(")
       { t=EXP();
         if MATCH(")") return t;
         else throw SYNTAXERROR;
       }
   try return NUMBER();
   catch SYNTAXERROR
      return VARIABLE();   // not a number: try a variable
   endtry
}

float NUMBER()  // simple float input conversion
{ float t=0;
  int fractiondigits=0;
  int exponent=0;
  switch INPUT(I)
  {  case "0".."9":
     { t=t*10+INPUT(I)-"0"; I++;
       while ISDIGIT(INPUT(I))
          { t=t*10+INPUT(I)-"0"; I++; }
       if MATCH(".")
          goto collect_fraction;
       else goto collect_exponent;
      }
     case ".":
     { if !ISDIGIT(INPUT(I+1)) throw SYNTAXERROR;  // a lone "." is not a number
       I++;                                        // consume the "."
       goto collect_fraction;
     }
     default: throw SYNTAXERROR;
  }
  collect_fraction:
     while ISDIGIT(INPUT(I))
       { t=t*10+INPUT(I)-"0"; I++; fractiondigits++; }
  collect_exponent:
     if MATCH("E")
       { boolean sign=false;
         if MATCH("-") sign=true;
         if !ISDIGIT(INPUT(I)) throw SYNTAXERROR;
         while ISDIGIT(INPUT(I))
           { exponent=exponent*10+INPUT(I)-"0"; I++; }
         if sign then exponent=-exponent;
       }
     return t*10^(exponent-fractiondigits);
}

float VARIABLE() // handles single letter variable names.
{  if ISLETTER(INPUT(I))
   { char name=INPUT(I);   // remember the letter before advancing past it
     I++;
     return VARIABLEVALUE[name-"A"];
   }
   else throw SYNTAXERROR;
}

boolean MATCH(c: char)
{  if INPUT(I)==c
      {  I++;
         return true;
      }
   else return false;
}

You'll obviously want to write a grammar for the expressions you want to evaluate. But assuming you are adding only relationals and AND, OR and NOT operators, following this style it should take you about 30 minutes to code the whole thing.
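As a rough illustration of that extension, here is a minimal Python sketch (not .NET, and not the code from this answer) of a recursive descent evaluator for conditions like the one in the question. The grammar, the token regex, and the names ConditionParser and evaluate are all assumptions made for this example: COND = CONJ ( 'OR' CONJ )* ; CONJ = NEG ( 'AND' NEG )* ; NEG = 'NOT' NEG | '(' COND ')' | OPERAND RELOP OPERAND. Unlike the pseudocode above, it splits the input into tokens first so that keywords and multi-character operators are easy to match.

```python
import re

# Assumed token set: numbers, identifiers/keywords, relational operators, parentheses.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[A-Za-z_]\w*|<=|>=|<>|[=<>()])")

RELOPS = {
    "=": lambda a, b: a == b,
    "<>": lambda a, b: a != b,
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
}


def tokenize(text):
    """Split the input into a list of token strings, or raise SyntaxError."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class ConditionParser:
    def __init__(self, text, variables):
        self.tokens = tokenize(text)
        self.pos = 0                  # plays the role of the global index I
        self.variables = variables    # map of variable names to values

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        # Keywords are compared case-insensitively; expected is upper case.
        tok = self.peek()
        if tok is not None and tok.upper() == expected:
            self.pos += 1
            return True
        return False

    def parse(self):
        result = self.cond()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return result

    def cond(self):                   # COND = CONJ ( 'OR' CONJ )*
        result = self.conj()
        while self.match("OR"):
            rhs = self.conj()         # always parse the right side to consume it
            result = result or rhs
        return result

    def conj(self):                   # CONJ = NEG ( 'AND' NEG )*
        result = self.neg()
        while self.match("AND"):
            rhs = self.neg()
            result = result and rhs
        return result

    def neg(self):                    # NEG = 'NOT' NEG | '(' COND ')' | comparison
        if self.match("NOT"):
            return not self.neg()
        if self.match("("):
            result = self.cond()
            if not self.match(")"):
                raise SyntaxError("expected ')'")
            return result
        return self.comparison()

    def comparison(self):             # OPERAND RELOP OPERAND
        left = self.operand()
        op = self.peek()
        if op not in RELOPS:
            raise SyntaxError(f"expected relational operator, got {op!r}")
        self.pos += 1
        return RELOPS[op](left, self.operand())

    def operand(self):                # NUMBER | VARIABLE
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if tok[0].isdigit():
            return float(tok)
        if tok[0].isalpha() or tok[0] == "_":
            return self.variables[tok]    # KeyError for an unknown variable
        raise SyntaxError(f"expected number or variable, got {tok!r}")


def evaluate(text, variables):
    return ConditionParser(text, variables).parse()


print(evaluate("(x=7 OR y=1) AND (x>0)", {"x": 7, "y": 0}))  # True
```

Each grammar rule maps to one method, as in the pseudocode. The operands are deliberately restricted to a number or a variable so that a leading parenthesis always starts a nested condition.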

This doesn't account for how the input expression got collected into the INPUT buffer, nor does it address the issue of how variables got values; I've assumed that the map of variable names to values got magically filled somehow in advance. If you want to allow simple assignments as well as expressions, just extend the BNF a bit by adding a rule to allow assignments e.g.,

EXP = VARIABLE ':=' EXP ;

Processing the assignment requires a bit of trickiness: as you match the pieces and discover the VARIABLE, you'll need a way to capture the variable name (modify VARIABLE to remember the variable name in a global), and once the syntax of the assignment rule has been recognized, update the map of variable names to values with the computed value.
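One possible way to handle the assignment rule is sketched below in Python. Note that it swaps in a different technique from the one described above: rather than remembering the variable name in a global, it works on a token list and peeks one token ahead for ':='. The class name Evaluator, the token regex, and the rule EXP = VARIABLE ':=' EXP | SUM are assumptions for this example.

```python
import re

# Assumed token set: numbers, identifiers, ':=', arithmetic operators, parentheses.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[A-Za-z]\w*|:=|[-+*/()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Evaluator:
    def __init__(self):
        self.variables = {}           # map of variable names to values

    def run(self, text):
        self.tokens, self.pos = tokenize(text), 0
        value = self.exp()
        if self.pos != len(self.tokens):
            raise SyntaxError("unexpected trailing input")
        return value

    def peek(self, offset=0):
        i = self.pos + offset
        return self.tokens[i] if i < len(self.tokens) else None

    def match(self, tok):
        if self.peek() == tok:
            self.pos += 1
            return True
        return False

    def exp(self):                    # EXP = VARIABLE ':=' EXP | SUM
        name = self.peek()
        if name is not None and name[0].isalpha() and self.peek(1) == ":=":
            self.pos += 2             # consume VARIABLE and ':='
            value = self.exp()
            self.variables[name] = value   # update the name-to-value map
            return value
        return self.sum()

    def sum(self):                    # SUM = TERM ( '+' TERM | '-' TERM )*
        value = self.term()
        while True:
            if self.match("+"):
                value += self.term()
            elif self.match("-"):
                value -= self.term()
            else:
                return value

    def term(self):                   # TERM = PRIMARY ( '*' PRIMARY | '/' PRIMARY )*
        value = self.primary()
        while True:
            if self.match("*"):
                value *= self.primary()
            elif self.match("/"):
                value /= self.primary()
            else:
                return value

    def primary(self):                # '-' PRIMARY | '(' EXP ')' | NUMBER | VARIABLE
        if self.match("-"):
            return -self.primary()
        if self.match("("):
            value = self.exp()
            if not self.match(")"):
                raise SyntaxError("expected ')'")
            return value
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if tok[0].isdigit():
            return float(tok)
        if tok[0].isalpha():
            return self.variables[tok]     # KeyError if never assigned
        raise SyntaxError(f"unexpected token {tok!r}")


ev = Evaluator()
ev.run("x := 2 + 3")
print(ev.run("x * (x - 1)"))  # 20.0
```

Because the variable map lives on the Evaluator instance, values assigned in one call to run are visible to later calls.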

The float input code is a hack, and can produce slightly incorrect input values (but it was easy to code :) If you want more precise float input conversions, you should simply collect the characters that make up the float constant, and then hand them to a library string-to-float conversion routine.
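The "collect the characters, then call a library routine" approach could look roughly like this in Python. The function name read_number and its (value, new_index) return convention are made up for this sketch; Python's built-in float() stands in for the library string-to-float routine.

```python
DIGITS = "0123456789"


def read_number(text, i):
    """Scan a float constant starting at text[i]; return (value, new_index)."""
    start = i
    while i < len(text) and text[i] in DIGITS:      # integer part
        i += 1
    if i < len(text) and text[i] == ".":            # optional fraction
        i += 1
        while i < len(text) and text[i] in DIGITS:
            i += 1
    if not any(ch in DIGITS for ch in text[start:i]):
        raise SyntaxError(f"number expected at position {start}")
    # Optional exponent: only consumed when at least one digit follows it.
    if i < len(text) and text[i] in "eE":
        j = i + 1
        if j < len(text) and text[j] in "+-":
            j += 1
        if j < len(text) and text[j] in DIGITS:
            while j < len(text) and text[j] in DIGITS:
                j += 1
            i = j
    # The library routine does the actual conversion.
    return float(text[start:i]), i


print(read_number("x=1.5E-3)", 2))  # (0.0015, 8)
```

The scanner only decides where the constant ends, so the rounding behavior is whatever the library conversion provides.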

If you just need simple expressions like in your example you can use NCalc. Fast and easy to use.

You can take a look at Irony and the SampleExpressionEvaluator in the examples.

And here's a good article to start with (if you know the basics of defining grammars you'll be ready very quickly):

For simple expressions like the ones given, I would write a recursive descent parser. Write out your BNF first and then just code it out. You can take one of 3 approaches in terms of evaluation:

1. Emit expression trees
2. Emit IL
3. Roll your own bytecode VM

I'd go with (1) as it's the simplest approach.
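To illustrate the idea behind approach (1), here is a small Python sketch in which the parser returns a tree instead of a value, and a separate function walks the tree. In .NET the answer presumably means the expression trees in System.Linq.Expressions; the nested tuples, the simplified grammar, and the names parse and evaluate used here are only a language-neutral stand-in chosen for this example. Keywords must be upper case in this sketch.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[A-Za-z_]\w*|<=|>=|<>|[=<>()])")
RELOPS = {"=": operator.eq, "<>": operator.ne, "<": operator.lt,
          "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    """Parse a condition and return its tree as nested tuples."""
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def cond():                       # COND = CONJ ( 'OR' CONJ )*
        node = conj()
        while peek() == "OR":
            take()
            node = ("OR", node, conj())
        return node

    def conj():                       # CONJ = ATOM ( 'AND' ATOM )*
        node = atom()
        while peek() == "AND":
            take()
            node = ("AND", node, atom())
        return node

    def atom():                       # ATOM = '(' COND ')' | OPERAND RELOP OPERAND
        if peek() == "(":
            take()
            node = cond()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        left, op, right = operand(), take(), operand()
        if op not in RELOPS:
            raise SyntaxError(f"expected relational operator, got {op!r}")
        return (op, left, right)

    def operand():                    # NUMBER | VARIABLE
        tok = take()
        if tok[0].isdigit():
            return ("num", float(tok))
        if tok[0].isalpha() or tok[0] == "_":
            return ("var", tok)
        raise SyntaxError(f"expected number or variable, got {tok!r}")

    tree = cond()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def evaluate(node, env):
    """Walk the tree, looking variables up in env."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left, right = evaluate(node[1], env), evaluate(node[2], env)
    if kind == "AND":
        return left and right
    if kind == "OR":
        return left or right
    return RELOPS[kind](left, right)


tree = parse("(x=7 OR y=1) AND (x>0)")
print(evaluate(tree, {"x": 7, "y": 0}))   # True
print(evaluate(tree, {"x": -3, "y": 1}))  # False
```

Separating parsing from evaluation means the condition is parsed once and can then be evaluated against many different sets of variable values.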
