anf2cnf hell in Sage

There is an ANF (Algebraic Normal Form) to CNF (Conjunctive Normal Form) converter by Martin Albrecht in Sage. Essentially, it performs the ANF to CNF conversion that I have described previously in this blog entry. Me, as unsuspecting as anyone else, have been using this for a couple of days now. It seemed to do its job. However, today, I wanted to backport some of my ideas to this converter. And then it hit me.

Let me illustrate with a short example why I think something is wrong with this converter. We will try to encode that variable 0 and variable 1 cannot both be TRUE. This is as simple as saying x0*x1 = 0 in plain old math. In Sage this is done like this:

sage: B = BooleanPolynomialRing(10,'x')
sage: load anf2cnf.py
sage: anf2cnf = ANFSatSolver(B)
sage: polynom = B.gen(0)*B.gen(1)
sage: print polynom
x0*x1

So far, so good. Let’s try to make a CNF out of this:

sage: print anf2cnf.cnf([polynom])
p cnf 4 6
2 -4 0
3 -4 0
4 -2 -3 0
1 0
4 1 0
-4 -1 0

Oooops. Why do we need 6 clauses to describe this? It can be described with exactly one:

p cnf 2 1
-1 -2

This lonely clause simply bans the solution 1 = TRUE, 2 = TRUE, which was our original aim.

Let me just mention one more thing about this converter: it repeats definitions. For example:

sage: print two_polynoms
[x1*x2 + 1, x1*x2 + x1]
sage:  print anf2cnf.cnf(two_polynoms)
p cnf 4 13
2 -4 0
3 -4 0
4 -2 -3 0
1 0
4 0
2 -4 0
3 -4 0
4 -2 -3 0
1 0
4 2 1 0
-4 -2 1 0
-4 2 -1 0
4 -2 -1 0

Notice that clause 2 -4 0 and the two following it have been repeated twice, as well as the clause setting 1 to TRUE.

I have been trying to get around these problems lately. When ready, the new script will be made available, along with some HOWTO. It will have some minor shortcomings, but already, the number of clauses in problem descriptions have dramatically dropped. For example, originally, the description of an example problem in CNF contained 221’612 clauses. After minor corrections, the same can now be described with only 122’042 clauses. This of course means faster solving, cleaner and even human-readable CNF output, etc. Fingers are crossed for an early release ;)