Detecting equivalent expressions

2023-03-23 02:18 问答作者：

I'm currently working on a Java a开发者_运维问答pplication where I need to implement a system for building BPF expressions. I also need to implement mechanism for detecting equivalent BPF expressions.

Building the expression is not too hard. I can build a syntax tree using the Interpreter design pattern and implement the toString for getting the BPF syntax.

However, detecting if two expressions are equivalent is much harder. A simple example would be the following:

A: src port 1024 and dst port 1024
B: dst port 1024 and src port 1024

In order to detect that A and B are equivalent I probably need to transform each expression into a "normalized" form before comparing them. This would be easy for above example, however, when working with a combination of nested AND, OR and NOT operations it's getting harder.

Does anyone know how I should best approach this problem?

One way to compare boolean expressions may be to convert both to the disjunctive normal form (DNF), and compare the DNF. Here, the variables would be Berkeley Packet Filter tokens, and the same token (e.g. port 80) appearing anywhere in either of the two expressions would need to be assigned the same variable name.

There is an interesting-looking applet at http://www.izyt.com/BooleanLogic/applet.php - sadly I can't give it a try right now due to Java problems in my browser.

I'm pretty sure detecting equivalent expressions is either an np-hard or np-complete problem, even for boolean-only expressions. Meaning that to do it perfectly, the optimal way is basically to build complete tables of all possible combinations of inputs and the results, then compare the tables.

Maybe BPF expressions are limited in some way that changes that? I don't know, so I'm assuming not.

If your problems are small, that may not be a problem. I do exactly that as part of a decision-tree designing algorithm.

Alternatively, don't try to be perfect. Allow some false negatives (cases which are equivalent, but which you won't detect).

A simple approach may be to do a variant of the normal expression-evaluation, but evaluating an alternative representation of the expression rather than the result. Impose an ordering on commutative operators. Apply some obvious simplifications during the evaluation. Replace a rich operator set with a minimal set of primitive operators - e.g. using de-morgans to eliminate OR operators.

This alternative representation forms a canonical representation for all members of a set of equivalent expressions. It should be an equivalence class in the sense that you always find the same canonical form for any member of that set. But that's only the set-theory/abstract-algebra sense of an equivalence class - it doesn't mean that all equivalent expressions are in the same equivalence class.

For efficient dictionary lookups, you can use hashes or comparisons based on that canonical representation.

I'd definitely go with syntax normalization. That is, like aix suggested, transform the booleans using DNF and reorder the abstract syntax tree such that the lexically smallest arguments are on the left-hand side. Normalize all comparisons to < and <=. Then, two equivalent expressions should have equivalent syntax trees.

继续阅读：algorithm design-patterns

Detecting equivalent expressions

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？