Algebraic Normal Form, or simply ANF, is the form used to describe cryptographic functions. It looks like this:

Where are binary variables, the sign is binary XOR, and the sign is binary AND. Every item that is connected with the XOR-s is called a monomial, and its degree is the number of independent variables inside it. So, is a 2-degree monomial in the first equation, and is a 1-degree monomial in the second equation.

An important problem in SAT is to translate an ANF into the input of SAT solvers, Conjunctive Normal Form, or simply CNF. A CNF formula looks like this:

Where again are binary variables, the sign is the binary OR, and the sign is the binary NOT (i.e. inverse).

### The scientific reference paper

The most quoted article about ANF-to-CNF translation is that by Courtois and Bard, which advocates for the following translation process:

- Introduce internal variables for every monomial of degree larger than 1
- Describe the equations as large XORs using the recently introduced variables

### The example problem in CNF

According to the original method, the equations presented above are first translated to the following form:

Where are fresh binary variables. Then, each of the above equations are translated to CNF. The internal variables are translated as such:

- Translation of :

- Translation of

- Translation of :

- Translation of :

We are now done. The final CNF file is this. This final CNF has a small header, and some fluffs have been removed: variables are not named, but referred to with a number, and the `= true`

-s have been replaced with a line-ending `0`

.

As you can imagine, there are many ways to enhance this process. I have written a set of them down in this paper. The set of improvements in a nutshell are the following:

- If a variable’s value is given, (e.g.
`a = true`

), first substitute this value in the ANF, and transcribe the resulting ANF to CNF. - If there are two monomials, such as:
`a*b + b`

in an equation, make a non-standard monomial`(-a)*b`

from the two, and transcribe this to CNF. Since the CNF can use negations, this reduces the size of the resulting CNF - If the ANF can be described using Karnaugh maps shorter than with the method presented above, use that translation method.

### An automated way

I just got a nice script to perform step (1) from Martin Albrecht, one of the developers of Sage:

1 2 3 4 5 6 7 8 9 |
sage: B = BooleanPolynomialRing(4500,'x') sage: L = [B.random_element(degree=2,terms=10) for _ in range(4000)] sage: s = [B.gen(i) + GF(2).random_element() for i in range(1000)] sage: %time F = mq.MPolynomialSystem(L+s). eliminate_linear_variables(maxlength=1) CPU time: 1.03 s, Wall time: 1.11 s |

In this code, Martin generates a boolean system of equations with 4500 variables, with 4000 random equations each with randomly selected monomials of degree 2 and of XOR size 10. Then, he sets 1000 random variables to a random value (true or false), and finally, he substitutes the assigned values, and simplifies the system. If you think this is a trivial issue, alas, it is not. Both Maple and Sage take hours to perform it if using the standard `eval`

function. The function above uses a variation of the ElimLin algorithm from the Courtois-Bard paper to do this efficiently.