开发者

Boolean Query / Expression to a Concrete syntax tree

I'开发者_开发百科m creating a search form that allows boolean expressions, like: "foo AND bar" or "foo AND NOT bar".

Is there a library for PHP, Ruby or Java that can transform boolean expressions to a concrete syntax tree?

(I could write my own lexer/parser, but I rather use something tried and tested)

EDIT: To clarify, I'm not parsing arrhythmic expressions. It's going to be used for parsing full text queries that allow boolean operators.


I recommend Treetop. It's a minilanguage that generates a parser for your PEG (Parsing Expression Grammar). It's easier to work with than a LALR grammar and more powerful than a recursive descent parser (i.e. it can do some lookaheads more than one symbol).

Although Treetop isn't that complex, it helps to have some simple examples to to start with. You will find them at threedaymonk's treetop examples.


For most parser framework, one of the standard introductory grammars is usually a mathematical expression parser. It parses things like 3+4*5-(6/-7), etc. Some tutorials would go even further and show you how to build the syntax tree and evaluate the expression.

A typical boolean expression grammar is usually directly analogous to mathematical expression grammar in this infix notation.

  • AND and OR are binary operators (i.e. they take 2 operands), just like + and *
  • NOT is a unary operator (it takes 1 operand), just like the unary -
  • Some operators have higher precedence than others
  • You may use (…) to group subexpressions

You should therefore be able to go through the calculator tutorial of most parser package and adapt it to your problem. In fact, in many ways boolean expression parsing is simpler because e.g.:

  • In mathematical expressions, the - operator is overloaded (may be binary or unary)
  • Some operator, e.g. exponentiation, is right-associative

In boolean expression grammar, usually the operators aren't overloaded, and everything is left-associative.

Links

  • antlr.org - Calculator Tutorial


I know this question is almost three years old now, but I recently put together a library in Java specifically to manipulate boolean expressions: jbool_expressions.

It includes a tool too parse expressions out of string input:

Expression<String> expr = ExprParser.parse("( ( (! C) | C) & A & B)")

You can also do some fairly simple simplification:

Expression<String> simplified = RuleSet.simplify(expr);
System.out.println(expr);

gives

(A & B)

Hope this helps.


just stumbled over this while looking for a sat solver... http://jboolexpr.sourceforge.net/ may be just what you need.


$patterns=array(
   '/\band\b/i',
   '/\bor\b/i',
   '/\bfoo\b/i',
   '/\bbar\b/i',
   '/\bnot\b/i');
$replace=array(
   '*',  // value for 'and'
   '+',  // value for 'or'
   '1',  // value for 'foo'
   '0',  // value for 'bar'
   '!'); // value for 'not'
$expression="foo AND NOT bar";
$result=eval(preg_replace($pattern, $replace, $expression));

You can easily add your own facts, it will automatically handle brackets and precedence. It might be a good idea to think about how you'll handle unexpected inputs in $expression (hint the replaced version should only contain 1, 0, +, *, (, ) and !)

C.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜