ANF to CNF conversion | Wonderings of a SAT geek

Algebraic Normal Form, or simply ANF, is the form used to describe cryptographic functions. It looks like this:
$a*b \oplus b*c \oplus b \oplus d = 0\\b*c \oplus c \oplus a = 0\\\ldots$
Where $a,\ldots d$ are binary variables, the $\oplus$ sign is binary XOR, and the $*$ sign is binary AND. Every item that is connected with the XOR-s is called a monomial, and its degree is the number of independent variables inside it. So, $a*b$ is a 2-degree monomial in the first equation, and $c$ is a 1-degree monomial in the second equation.

An important problem in SAT is to translate an ANF into the input of SAT solvers, Conjunctive Normal Form, or simply CNF. A CNF formula looks like this:
$a \vee b \vee c = 1\\a \vee \neg d = 1\\\ldots$
Where again $a,\ldots d$ are binary variables, the $\vee$ sign is the binary OR, and the $\neg$ sign is the binary NOT (i.e. inverse).

The scientific reference paper

The most quoted article about ANF-to-CNF translation is that by Courtois and Bard, which advocates for the following translation process:

Introduce internal variables for every monomial of degree larger than 1
Describe the equations as large XORs using the recently introduced variables

The example problem in CNF

According to the original method, the equations presented above are first translated to the following form:
$v1 = a*b\\v2 = b*c\\v1 \oplus v2 \oplus b \oplus d = 0\\v2 \oplus c \oplus a = 0$

Where $v1, v2$ are fresh binary variables. Then, each of the above equations are translated to CNF. The internal variables are translated as such:

Translation of $v1 = a*b$ :
$v1 \vee \neg a \vee \neg b = 1\\\neg v1 \vee a = 1\\\neg 1 \vee b = 1$
Translation of $v2 = b*c$
$v2 \vee \neg b \vee \neg c = 1\\\neg v2 \vee b = 1\\\neg v2 \vee c = 1$
Translation of $v1 + v2 + b + d = 0$ :
$\neg v1 \vee v2 \vee b \vee d = 1\\v1 \vee \neg v2 \vee b \vee d = 1\\v1 \vee v2 \vee \neg b \vee d = 1\\v1 \vee v2 \vee b \vee -d = 1\\\neg v1 \vee \neg v2 \vee \neg b \vee d = 1\\\neg v1 \vee \neg v2 \vee b \vee \neg d = 1\\\neg v1 \vee v2 \vee \neg b \vee \neg d = 1\\v1 \vee \neg v2 \vee \neg b \vee \neg d = 1$
Translation of $v2 + c + a = 0$ :
$v2 \vee c \vee \neg a = 1\\v2 \vee \neg c \vee a = 1\\\neg v2 \vee c \vee a = 1\\\neg v2 \vee \neg c \vee \neg a = 1$

We are now done. The final CNF file is this. This final CNF has a small header, and some fluffs have been removed: variables are not named, but referred to with a number, and the = true-s have been replaced with a line-ending 0.

As you can imagine, there are many ways to enhance this process. I have written a set of them down in this paper. The set of improvements in a nutshell are the following:

If a variable’s value is given, (e.g. a = true), first substitute this value in the ANF, and transcribe the resulting ANF to CNF.
If there are two monomials, such as: a*b + b in an equation, make a non-standard monomial (-a)*b from the two, and transcribe this to CNF. Since the CNF can use negations, this reduces the size of the resulting CNF
If the ANF can be described using Karnaugh maps shorter than with the method presented above, use that translation method.

An automated way

I just got a nice script to perform step (1) from Martin Albrecht, one of the developers of Sage:

sage: B = BooleanPolynomialRing(4500,'x')
sage: L = [B.random_element(degree=2,terms=10) 
      for _ in range(4000)]
sage: s = [B.gen(i) + GF(2).random_element() 
      for i in range(1000)]
sage: %time F = 
      mq.MPolynomialSystem(L+s).
      eliminate_linear_variables(maxlength=1)
CPU time: 1.03 s,  Wall time: 1.11 s

In this code, Martin generates a boolean system of equations with 4500 variables, with 4000 random equations each with randomly selected monomials of degree 2 and of XOR size 10. Then, he sets 1000 random variables to a random value (true or false), and finally, he substitutes the assigned values, and simplifies the system. If you think this is a trivial issue, alas, it is not. Both Maple and Sage take hours to perform it if using the standard eval function. The function above uses a variation of the ElimLin algorithm from the Courtois-Bard paper to do this efficiently.