ANF | Wonderings of a SAT geek

It is sometimes important to be able to quickly convert a binary function, described in as a truth table, to its Algebraic Normal Form (or simply, ANF). The truth table basically lists the points where a function is 1 and when it is 0. For instance, if the function has two variables, and its truth table is 0110 or, in hexadecimal form, 0x5 then the ANF of this function is simply x0 + x1, where x0 is the first variable of the function, and x1 is the second.

When it comes to more complex functions, an automatic method for this conversion is helpful. The Sage software can do exactly that:

sage: from sage.crypto.boolean_function import BooleanFunction
sage: B = BooleanFunction("0123456789ABCDEF")
sage: B.algebraic_normal_form()
x0*x1*x2 + x0*x1*x3 + x0*x1*x4 + x0*x1*x5 + x0*x2
+ x0*x3 + x1*x2 + x1*x4 + x2 + 1

Here, we used Sage to convert the function given by the truth table (in hex) 0x0123456789ABCDEF to its ANF equivalent.

I have long searched for a method for this, but at the end it was Luk Bettale who showed me the way. Many thanks to him for this!

Algebraic Normal Form, or simply ANF, is the form used to describe cryptographic functions. It looks like this:
$a*b \oplus b*c \oplus b \oplus d = 0\\b*c \oplus c \oplus a = 0\\\ldots$
Where $a,\ldots d$ are binary variables, the $\oplus$ sign is binary XOR, and the $*$ sign is binary AND. Every item that is connected with the XOR-s is called a monomial, and its degree is the number of independent variables inside it. So, $a*b$ is a 2-degree monomial in the first equation, and $c$ is a 1-degree monomial in the second equation.

An important problem in SAT is to translate an ANF into the input of SAT solvers, Conjunctive Normal Form, or simply CNF. A CNF formula looks like this:
$a \vee b \vee c = 1\\a \vee \neg d = 1\\\ldots$
Where again $a,\ldots d$ are binary variables, the $\vee$ sign is the binary OR, and the $\neg$ sign is the binary NOT (i.e. inverse).

The scientific reference paper

The most quoted article about ANF-to-CNF translation is that by Courtois and Bard, which advocates for the following translation process:

Introduce internal variables for every monomial of degree larger than 1
Describe the equations as large XORs using the recently introduced variables

The example problem in CNF

According to the original method, the equations presented above are first translated to the following form:
$v1 = a*b\\v2 = b*c\\v1 \oplus v2 \oplus b \oplus d = 0\\v2 \oplus c \oplus a = 0$

Where $v1, v2$ are fresh binary variables. Then, each of the above equations are translated to CNF. The internal variables are translated as such:

Translation of $v1 = a*b$ :
$v1 \vee \neg a \vee \neg b = 1\\\neg v1 \vee a = 1\\\neg 1 \vee b = 1$
Translation of $v2 = b*c$
$v2 \vee \neg b \vee \neg c = 1\\\neg v2 \vee b = 1\\\neg v2 \vee c = 1$
Translation of $v1 + v2 + b + d = 0$ :
$\neg v1 \vee v2 \vee b \vee d = 1\\v1 \vee \neg v2 \vee b \vee d = 1\\v1 \vee v2 \vee \neg b \vee d = 1\\v1 \vee v2 \vee b \vee -d = 1\\\neg v1 \vee \neg v2 \vee \neg b \vee d = 1\\\neg v1 \vee \neg v2 \vee b \vee \neg d = 1\\\neg v1 \vee v2 \vee \neg b \vee \neg d = 1\\v1 \vee \neg v2 \vee \neg b \vee \neg d = 1$
Translation of $v2 + c + a = 0$ :
$v2 \vee c \vee \neg a = 1\\v2 \vee \neg c \vee a = 1\\\neg v2 \vee c \vee a = 1\\\neg v2 \vee \neg c \vee \neg a = 1$

We are now done. The final CNF file is this. This final CNF has a small header, and some fluffs have been removed: variables are not named, but referred to with a number, and the = true-s have been replaced with a line-ending 0.

As you can imagine, there are many ways to enhance this process. I have written a set of them down in this paper. The set of improvements in a nutshell are the following:

If a variable’s value is given, (e.g. a = true), first substitute this value in the ANF, and transcribe the resulting ANF to CNF.
If there are two monomials, such as: a*b + b in an equation, make a non-standard monomial (-a)*b from the two, and transcribe this to CNF. Since the CNF can use negations, this reduces the size of the resulting CNF
If the ANF can be described using Karnaugh maps shorter than with the method presented above, use that translation method.

An automated way

I just got a nice script to perform step (1) from Martin Albrecht, one of the developers of Sage:

sage: B = BooleanPolynomialRing(4500,'x')
sage: L = [B.random_element(degree=2,terms=10) 
      for _ in range(4000)]
sage: s = [B.gen(i) + GF(2).random_element() 
      for i in range(1000)]
sage: %time F = 
      mq.MPolynomialSystem(L+s).
      eliminate_linear_variables(maxlength=1)
CPU time: 1.03 s,  Wall time: 1.11 s

In this code, Martin generates a boolean system of equations with 4500 variables, with 4000 random equations each with randomly selected monomials of degree 2 and of XOR size 10. Then, he sets 1000 random variables to a random value (true or false), and finally, he substitutes the assigned values, and simplifies the system. If you think this is a trivial issue, alas, it is not. Both Maple and Sage take hours to perform it if using the standard eval function. The function above uses a variation of the ElimLin algorithm from the Courtois-Bard paper to do this efficiently.

Wonderings of a SAT geek

A blog about SAT solving and cryptography

Tag Archives: ANF

Truth table to ANF conversion

ANF to CNF conversion

The scientific reference paper

The example problem in CNF

An automated way