Algorithms for optimizing conjunctive normal form expressions for particular instruction sets?
I'm using Espresso logic minimizer to produce a minimized form of a set of boolean equations. However rather than generating logic for a programmable array logic (which is what Espresso is normally used for), I am looking开发者_StackOverflow中文版 to implement these on a standard microprocessor. The trouble is that Espresso produces output in conjunctive normal form, which is perfect for a PAL but non-optimal for an x86 or PPC.
For instance, Espresso ignores XOR entirely - in the below Espresso output, the subexpression (!B0&!B1&B2&!B3) | (!B0&!B1&!B2&B3)
is equivalent to (!B0&!B1&(B2^B3))
. This substitution does increase the gate depth / critical path of the expression, but given that I'm looking at expressions with a sufficient number of terms to completely saturate the execution resources of any CPU around it seems reasonable to trade off some gate depth for reducing the overall # of instructions. I'd also like to extend it to understand how to use instructions like ANDC or NOR which are available on some processors of interest to me.
Example of the CNF expressions I'm looking at:
O0 = (B0&!B1&!B2&B3) | (!B0&B1&B2&B3) | (!B0&!B1&B2&B3) | (B1&!B3) | (!B0 &!B2&!B3);
O1 = (B0&B1&!B2&B3) | (B0&!B1&B2&!B3) | (B0&B1&B2&!B3) | (!B0&!B1&B2&!B3) | (!B0&!B1&!B2&B3) | (!B0&!B1&B2&B3) | (!B0&!B2&!B3);
O2 = (B0&!B1&!B2&B3) | (B0&!B1&B2&!B3) | (B0&B1&B2&B3) | (!B0&B1&!B3) | (!B0&!B2&B3) | (!B0&!B1&B2&B3);
O3 = (!B0&B1&!B2&!B3) | (B0&B1&B2&B3) | (!B0&B1&B2&B3) | (B0&B1&B2&!B3) | (B0&!B1&!B2) | (!B0&!B1&B2&!B3) | (!B0&!B1&!B2&B3);
So, to make this an actual question; in order of preference:
Do you know of an option or extension of Espresso that will produce the kind of expressions I want?
Do you know of any tool for boolean logic minimization that understands (or can be taught) various gate types, rather than only producing CNF for PALs?
Do you know of an algorithm for converting from CNF expressions like the ones above to expressions using additional types of gates?
If you don't know of an algorithm for it, do you know of, or can think of, any useful heuristics in doing this?
(And, just in case you were going to suggest it - testing shows that GCC and ICC (or, I would bet, any other C compiler in existence) aren't smart enough to do the processor specific minimization for me from the CNF expressions - that would be really really nice, but examining the output of -O3 -S for both of them shows they can't even catch the cases where XOR can be used).
The most well-known algorithm for Boolean formula minimisation is the Quine-McCluskey algorithm, which yields the smallest DNF formula, but is computational expensive (necessarily, since the problem is outside PTIME, cf. The complexity of Boolean formula minimization, 2007). There's a literate Java implementation; the basic concept is crucial to Prolog, so if you have any experience with Prolog, the idea should come easily enough.
Postscript There's an IEEE-paywalled article, Extending Quine-McCluskey for exclusive-or logic synthesis, abstract:
Various forms of Boolean minimization have been used within electronic engineering degrees as a key part of the syllabus. Typically, Karnaugh maps and Quine-McCluskey methods are the principal exhaustive search techniques for digital minimization at the undergraduate level as they are easy to use and simple to understand. Despite the popularity of these methods, they are not well suited to typical digital circuits. Simple examples of such circuits are parity, adders, gray code generators, and so on. The common factor among these is the Exclusive-Or logic gate. This problem is exacerbated by the increasing importance of Exclusive-Or in modern design. This paper proposes an extension to the Quine-McCluskey method which successfully incorporates Exclusive-Or gates within the minimization process. A number of examples are given to demonstrate the effectiveness of this approach. This technique is easy to master as it can be considered to be an extension to the Quine-McCluskey method.
I had been thinking of how to extend the method before I looked this up: you should be able to synthesise applications of XOR by using an alternate version of resolution that corresponds to them. For example, for a disjunctive clause F
in a CNF, which does not contain either of the atoms A
, or B
, from the clauses F | A | ~B
and F | ~A | B
, then you can replace them by F | XOR(A,B)
.
1) Have you considered using Superoptimization to choose instruction sequences for you?
As an example, the GNU superoptimizer manufactures extremely short instruction sequences that will compute the equivalent of a function you provide to it. In your case, you'd provide a function implementing the boolean equation of interest.
These tools often operate by enumerating the space of possible computations starting with the smallest, and deciding if an individual computation matches your computation goal. They can't necessarily provide solutions (let alone optimal ones) for very complex instructions, but if a modest number of instructions will succeed they often find a (unusual!) and very efficient sequence. XOR/NOR/ANDC would easily be in scope of the GNU superoptimizer.
2) You might try using an algebraic simplifier provided with the right set of algebraic equivalences. We've used our DMS Software Reengineering Toolkit, a program transformation engine that accepts arbitrary rewrite rules and understands commutative and associative laws, to implement boolean simplifiers that involve various operators such as XOR and NOR. You need the rule applier, and a hill-climbing algorithm (climbing the hill with the least number of operators) and a backtracking algorithm. With a depth-first iterative deepening search you can find an optimal solution if the expression isn't too complex; with a branch and bound search search, you can find a solution quickly and then attempt to minimize its size. You even have a relatively good hueristic measure: operands so far not included in the computation. The biggest problem with an equational simplifier is that it won't take into account the register constraints or opportunities available with your specific instruction set.
3) You can implement your own search (iterative deepening, branch and bound) over the sets of boolean instructions available to you, and include the constraints. (This is getting somewhat back to what the superoperoptimizers do). I've done this to compute minimal sequences of instructions to implement multiply-by-constant on x86, accounting for up to 3 registers and taking advantage of 3-operand instructions such as (load effective address) LEA X,[Y+K*Z] on registers X, Y, Z with constant K=1,2,4,8, ADD X,Y, SUB X,Y, MOV and NEG instructions. If you code this as a recursive program in any reasonable language, you can code one in a few hundred lines. (It produces some truly squirrely sequences).
精彩评论