# On benchmark randomization

As many of you have heard, the SAT Competition for this year has been announced. You can send in your benchmarks between the 12th and the 22nd of April, so get started. I have a bunch of benchmarks I have already submitted about 2 years ago, still waiting for any reply from those organizers — but the organizers are different this year, so fingers crossed.

What I want to talk about today is benchmark randomization. This is a very-very touchy topic. However, I fear that it’s touchy for the wrong reasons, and so I think it’s important to talk about it in detail.

## What is benchmark randomization?

Benchmark randomization is when a benchmark that is submitted is shuffled around a bit. There are many ways to shuffle a problem, and I will discuss this in a bit, but the point is that the problem at hand that is described by the benchmark CNF should not be changed, or changed only in a very-very minor way, such that everyone agrees that it doesn’t affect the core problem itself as described by the CNF.

## Why do we need shuffling?

We need shuffling because simply put, there aren’t enough good benchmarks and so the benchmarks of yesteryear (and the year before, and before, and…) re-appear often. This would be OK if SAT solvers couldn’t be tuned to solving specific problems faster. Note that I am not suggesting that SAT solvers are intentionally manipulated to solve specific problems faster by unscrupulous researchers. Instead, the following happens.

## Unintentional random seed improvements

Researchers test the performance of their SAT solvers on specific instances and then tune their solvers, testing the performance again and again on the same instances to check if they have improved performance. Logically this is the best way to test and improve performance: use the same well-defined test-set all the time for meaningful comparison. Since the researcher wants to use the instances that he/she thinks is the current use-case of SAT solvers, he naturally uses the instances of SAT competitions, since those are representative. I did and still do the same.

So, researchers add their idea to a SAT solver, and test. If the idea is not improving things then some change is made and tested again. Since modern CDCL SAT solvers behave quite randomly, and since any change in the source code changes the behaviour quite significantly, a small change in the source code (tuning of a parameter, for example) will change the behaviour. And since the set of problems tested on is fixed, there is a chance that more problems will be solved. If more are solved, the researcher might correctly interpret this as a general improvement, not specific to the problem set. However, it may very well be generic, it is also specific.

The above suggests that the randomness of the SAT solver is completely unintentionally tuned to specific problems — a subset of which will appear next year in the competition.

## Easy fixes

Since there aren’t enough benchmark problems, and in particular some benchmark types are rare, I suggest to fix the unintentional tuning of solvers to specific problems by changing the benchmarks in minor ways. Here is a list, with an explanation why I think it’s OK to perform the manipulation:

1. Propagate variables. Unitary clauses are often part of benchmarks. Propagating some of these, some recursively, gives quite a bit of problem space variation. Propagation is performed by every CDCL SAT solver, and I think many would be  surprised if it didn’t help SAT solvers that worked differently than  current SAT solvers. Agreeing on performing partial propagation is something that shouldn’t be too difficult.
2. Renumber variables. For some variable X that is not used (or is fixed to a value that has been propagated), every variable that is higher than X is decremented by one, and the CNF header is fixed to reflect this change.  Such a minor renumbering may be approved by every researcher as something that doesn’t change the problem or its structure. Note that if  partial propagation is performed there should be quite a number of variables that can be removed. Renumbering some, but not others is a way to shuffle the problem. A more radical way of renumbering variables would be to completely shuffle them, however that would change the way the problem is described in quite a radical way, so some would correctly object and it’s not necessary anyway.
3. Replace equivalent literals. Perform strongly connected component analysis and replace equivalent literals. This has been shown to significantly improve performance and I have never seen a case where it doesn’t. Since equivalent literal replacement can be performed with a lot of freedom, this is quite a bit of shuffling space. For example, if v1=v2=v3, then any of the v1, v2, v3 can be the one that replaces the rest in the CNF. Picking one randomly is a way to shuffle the instance

There are other ways of shuffling, but either they change the instance too much (e.g. blocked clause removal), or can be undone quite easily (e.g. shuffling the order of the clauses). In fact, (3) is already quite a touchy issue I think, but with (1) and (2) all could agree on. Neither requires the order of the literals or the order of the clauses to change — some clauses (e.g. unitary ones) and literals (some of those that are set) would be removed, but that’s all. The problem remains essentially unchanged such that most probably even the original problem author would easily recognize it. However, it would be different from a SAT solver point of view: these changes would change the random seed of the solver, forcing the solver to behave in a way that is less tuned to this specific problem instance.

## Conclusion

SAT solvers are currently tuned too much to specific instances. This is not intentional by the researchers, however it still affects the results. To obtain better, less biased results we should shuffle the problem instances we have. Above, I suggested three ways to shuffle the instances in such a way that most would agree they don’t disturb or change the complexity of the underlying problem described by the instance. I hope that some of these suggestions will be employed, if not this year then for next year’s SAT competition such that we could reach better, more meaningful results.

# anf2cnf script released

I have finally managed to fix the script that converts ANF problems to CNF format in the Sage math system. The original script was having some problems that I blogged about. The new script has corrected most of the shortcomings of the original script, as well as added some textual help for the user.

For instance, the equations

that last time required 13 clauses and 4 variables in CNF, now look like this:

which is 1 variable and 7 clauses shorter than the original, not to mention the visually cleaner look and human-parseable output. The new script is available here. Hopefully, some of my enhancements included in the Grain-of-Salt package will be included in this script. The problem is mainly that Grain-of-Salt uses radically different data structures, and is written in a different programming language, so porting is not trivial.

# anf2cnf hell in Sage

There is an ANF (Algebraic Normal Form) to CNF (Conjunctive Normal Form) converter by Martin Albrecht in Sage. Essentially, it performs the ANF to CNF conversion that I have described previously in this blog entry. Me, as unsuspecting as anyone else, have been using this for a couple of days now. It seemed to do its job. However, today, I wanted to backport some of my ideas to this converter. And then it hit me.

Let me illustrate with a short example why I think something is wrong with this converter. We will try to encode that `variable 0` and `variable 1` cannot both be TRUE. This is as simple as saying `x0*x1 = 0` in plain old math. In Sage this is done like this:

So far, so good. Let’s try to make a CNF out of this:

Oooops. Why do we need 6 clauses to describe this? It can be described with exactly one:

This lonely clause simply bans the solution `1 = TRUE, 2 = TRUE`, which was our original aim.

Notice that clause `2 -4 0` and the two following it have been repeated twice, as well as the clause setting `1` to TRUE.

I have been trying to get around these problems lately. When ready, the new script will be made available, along with some HOWTO. It will have some minor shortcomings, but already, the number of clauses in problem descriptions have dramatically dropped. For example, originally, the description of an example problem in CNF contained 221’612 clauses. After minor corrections, the same can now be described with only 122’042 clauses. This of course means faster solving, cleaner and even human-readable CNF output, etc. Fingers are crossed for an early release ;)

# ANF to CNF conversion

Algebraic Normal Form, or simply ANF, is the form used to describe cryptographic functions. It looks like this:
$a*b \oplus b*c \oplus b \oplus d = 0\\b*c \oplus c \oplus a = 0\\\ldots$
Where $a,\ldots d$ are binary variables, the $\oplus$ sign is binary XOR, and the $*$ sign is binary AND. Every item that is connected with the XOR-s is called a monomial, and its degree is the number of independent variables inside it. So, $a*b$ is a 2-degree monomial in the first equation, and $c$ is a 1-degree monomial in the second equation.

An important problem in SAT is to translate an ANF into the input of SAT solvers, Conjunctive Normal Form, or simply CNF. A CNF formula looks like this:
$a \vee b \vee c = 1\\a \vee \neg d = 1\\\ldots$
Where again $a,\ldots d$ are binary variables, the $\vee$ sign is the binary OR, and the $\neg$ sign is the binary NOT (i.e. inverse).

### The scientific reference paper

The most quoted article about ANF-to-CNF translation is that by Courtois and Bard, which advocates for the following translation process:

1. Introduce internal variables for every monomial of degree larger than 1
2. Describe the equations as large XORs using the recently introduced variables

### The example problem in CNF

According to the original method, the equations presented above are first translated to the following form:
$v1 = a*b\\v2 = b*c\\v1 \oplus v2 \oplus b \oplus d = 0\\v2 \oplus c \oplus a = 0$

Where $v1, v2$ are fresh binary variables. Then, each of the above equations are translated to CNF. The internal variables are translated as such:

1. Translation of $v1 = a*b$:
$v1 \vee \neg a \vee \neg b = 1\\\neg v1 \vee a = 1\\\neg 1 \vee b = 1$
2. Translation of $v2 = b*c$
$v2 \vee \neg b \vee \neg c = 1\\\neg v2 \vee b = 1\\\neg v2 \vee c = 1$
3. Translation of $v1 + v2 + b + d = 0$:
$\neg v1 \vee v2 \vee b \vee d = 1\\v1 \vee \neg v2 \vee b \vee d = 1\\v1 \vee v2 \vee \neg b \vee d = 1\\v1 \vee v2 \vee b \vee -d = 1\\\neg v1 \vee \neg v2 \vee \neg b \vee d = 1\\\neg v1 \vee \neg v2 \vee b \vee \neg d = 1\\\neg v1 \vee v2 \vee \neg b \vee \neg d = 1\\v1 \vee \neg v2 \vee \neg b \vee \neg d = 1$
4. Translation of $v2 + c + a = 0$ :
$v2 \vee c \vee \neg a = 1\\v2 \vee \neg c \vee a = 1\\\neg v2 \vee c \vee a = 1\\\neg v2 \vee \neg c \vee \neg a = 1$

We are now done. The final CNF file is this. This final CNF  has a small header, and some  fluffs have been removed: variables are not named, but referred to with a number, and the `= true`-s have been replaced with a line-ending `0`.

As you can imagine, there are many ways to enhance this process. I have written a set of them down in this paper. The set of improvements in a nutshell are the following:

1. If a variable’s value is given, (e.g. `a = true`), first substitute this value in the ANF, and transcribe the resulting ANF to CNF.
2. If there are two monomials, such as: `a*b + b` in an equation, make a non-standard monomial `(-a)*b` from the two, and transcribe this to CNF. Since the CNF can use negations, this reduces the size of the resulting CNF
3. If the ANF can be described using Karnaugh maps shorter than with the method presented above, use that translation method.

### An automated way

I just got a nice script to perform step (1) from Martin Albrecht, one of the developers of Sage:

In this code, Martin generates a boolean system of equations with 4500 variables, with 4000 random equations each with randomly selected monomials of degree 2 and of XOR size 10. Then, he sets 1000 random variables to a random value (true or false), and finally, he substitutes the assigned values, and simplifies the system. If you think this is a trivial issue, alas, it is not. Both Maple and Sage take hours to perform it if using the standard `eval` function. The function above uses a variation of the ElimLin algorithm from the Courtois-Bard paper to do this efficiently.

# Could monomials be handled natively from SAT solvers?

I recently got a question that intrigued me:

I am new to this SAT solving world but I was wondering whether you thought considerable speedups were possible for crypto type problems (multivariate polys over GF(2)) by simply never converting the problem to cnf at all and thereby avoiding the combinatorial explosion that results in the conversion process. That is using the original xor formulation.

First of all, the question is a follow-up to xor-clauses: they implement XOR-s natively. Using them avoids a number of problems relating to the increase of variables. Why not implement monomials (i.e. “a*b” or “a*b*c”, where “*” is binary AND and variables are binary) natively? They are the only thing left to do. Personally, I am not overly optimistic about them, though. Let me got through some of my reasons here.

Firstly, the “exponential explosion” expressed in the question is in fact much less existent than people tend to think. The reason is that the intelligent variable activity heuristics, unit propagation, and conflict generation tend to take care of a lot of potential problems. Since the propagation of a variable will entail the propagation of many others (it depends, for crypto, around ~100), there is no real explosion, since there is not really 2^n, but more like 2^(n/100) combinations that need to be explored. This argumentation takes away some of the potential benefits that native monomials could bring.

The real problem, though, is the following. By moving monomials into cryptominisat and thus potentially speeding up the solving, conflict generation could become much more complex. So, if moving to an internal monomial representation entails making a mess of conflict generation, then using monomials internally may only make the solving slower.

Another reason that native monomials may not speed up solving so much, is that a lot of clauses inserted when converting monomials are binary clauses, which are extremely well dealt with in the CNF world — it would be hard to do it any better.

As a last, but very minor point, using monomials would increase the complexity of the program, which would mean not only a lot of man-hours lost debugging it, but also a loss of performance due to a (probably non-negligible) increase of instruction cache misses.

Oh well, so those are my reasons. I would be interested if someone has some comments on these, though.