# The variable speed of SAT solving

Timings from the article "Attacking Bivium Using SAT Solvers". The authors didn't seem to have randomised the problems enough: the time to solve should increase exponentially, but instead it goes up and down like a roller coaster

Vegard Nossum asked me about the varying time it took for CryptoMiniSat to solve a certain instance that was satisfiable. This inspired me to write an overly long reply, which I think might interest others. So, why does the solving time vary for a specific instance if we permutate the clauses? Intuitively, just by permutating the clauses, the problem itself doesn’t get any easier or harder, so why should that make any difference at all? I will now try to go through the reasons, though they can almost all be summed up as follows: the SAT solver has absolutely no clue what it is solving, and so to solve a problem, it relies on heuristics and randomisation, both of which are influenced by its starting seed, which in turn is influenced by the order of the clauses in the CNF file. And now on to the long explanation.

Let’s divide problems into two categories: satisfiable and unsatisfiable instances. First, let me talk about satisfiable instances. Let’s suppose that we have been solving the problem for some time, and we have connected the dots (variables) well enough through learnt clauses. All that the SAT solver is waiting for is a good guess to set 3-4 variables right, and then it will propagate all variables to the right setting to satisfy all clauses. The problem is therefore to get to the point of only needing to guess 3-4 variables, i.e. to connect the dots, which we do through resolution. We can sacrifice some or all of dot-connecting (i.e. resolution) with some more guessing, and this can in fact be coded down by simply foregoing some of the conflict analysis we do. In the extreme case, all conflict analysis can be disabled, and then the solver would not restart its search. The solver would in essence be brute-forcing the instance through BCP (boolean constraint propagation). It is easy to see why this latter is prone to variation: depending on the ordering of the variables in the search tree, the tree could be orders of magnitude smaller or larger, and the first solution can be at any point in the search tree, essentially at a random place.

If we decide to carry out resolution for satisfiable problems to counter the problem of the variation of the search-tree, it is interesting to realise the following: in most cases, we can not forgo guessing. The reason is simple yet leads to quite interesting properties. Essentially, a SAT instance can have multiple solutions. If there are two solutions, e.g. `01100...` and `10011...` i.e. the solutions are the complete inverse of one another, then the SAT solver will not be able to prove any variable to any value. The best it could do is to create binary clauses, in the example case for instance

`var1=0 -> var2=1, var3=1, var4=0...`
and
`var1=1 -> var2=0, var3=0, var4=1...`

If we do enough resolutions, the “guessing” part will eventually become a solution-selector, i.e. it will select a solution from the set of available solutions. If there are 2^20, evenly distributed in the search space, we might need to set 20 variables before a solution is found through a simple application of BCP. Naturally, SAT solvers don’t normally do this, as they are content in finding one satisfying assignment, reporting that to the user, and exiting. It would be interesting to know the average remaining number of variables that needed to be guessed to solve the solution at the point the SAT solver actually found the solution, but this has not been done yet as far as I know. Then, we would know the trade-off the SAT solver employs between resolution and searching when solving satisfiable instances.

All right, so much for satisfiable instances. What happens with UNSAT instances? Well, the solver must either go through the whole tree and realise there is no solution, or do resolution until it reaches an empty clause, essentially building a resolution tree with an empty clause at its root. Since both of these can be done at the same time, there is a similar trade-off as above, but this time it’s somewhat upside-down. First of all, the search tree can be smaller or larger depending on the ordering of the variables (as before), and secondly, the resolution tree can be smaller or larger depending on the order of resolutions. The minimal resolution tree is (I believe) NP-hard to find, which doesn’t help, but there is at least a minimum resolution tree that limits us, and there is a minimum search tree which we must go through completely, that limits us. So, in contrast to finding satisfying solutions, both of these are complete in some sense, which should make searching for them robust in terms of speed. Finding a satisfying solution is not complete, because, as I noted above, SAT solvers don’t find all satisfying solutions — if they did, they would actually have the same properties as solving an unsatisfiable problem, as they would have to prove that there are no more solutions remaining, essentially proving unsatisfiability.

The above reasoning is probably the reason why SAT solvers tend to behave more robustly when they encounter an unsatisfiable problem. In particular, CryptoMiniSat’s running time seems more robust when solving unsatisfiable problems. Also, for problems that contain a lot of randomly placed solutions in the search space, such as the (in)famous vmpc_33 and vmpc_34 problems, the solving times seem wholly unpredictable for at least two interesting SAT solvers: lingeling and CryptoMiniSat.

# Extended resolution is working!

The subtitle of this post should really say: and I was wrong, again. First, let me explain what I have been wrong about — wrong assumptions are always the most instructive.

I have been convinced for the past year that it’s best to have two type of restart and corresponding learnt clause usefulness strategies: the glue-based restart&garbage collection of Glucose and a variation of the geometric restart and activity-based garbage collection of MiniSat. The glue-based variation was observed to be very good for industrial instances with many variables, many of which can be fixed easily, and the MiniSat-type variation was observed to be good for cryptography-type instances with typically low number of variables, almost none of which can be fixed. However, I had the chance to talk to Armin Biere a couple of weeks ago, and while talking with him, I have realised that maybe one of Lingeling’s advantages was its very interesting single restart strategy, based on agility (presentation, PDF) which seemed to let it use glue-based garbage collection all the time, succeeding to solve both types of instances. So, inspired by the CCC crowd, in only 10 minutes I have implemented a simple version of this restart strategy into CryptoMiniSat, and voilĂˇ, it seems to work on both type of instances, using only glues!

Now that this dumb assumption of mine is out of the way, let me talk about what is really interesting. As per my last post, I have managed to add a lightweight form of Extended Resolution (ER). Since then, I have improved the heuristics here-and-there, and I have launched the solver on a somewhat strange problem (aloul-chnl11-13.cnf) that only has 2×130 variables, but seems to be hard: very few SAT solvers were able to solve this instance within 10000 secs during the last SAT Competition. So, after CryptoMiniSat cutting the problem into two distinct parts, each containing 130 variables, it started adding variables to better express the learnt clauses… and the average learnt clause size dropped from ~30 literals to ~13 literals in comparison with the version that didn’t add variables. Furthermore, and this is the really interesting part, the solver was able to prove that some of these newly introduced variables must have a certain value, or that some of them were equi- or antivalent to others. So, after lightening the problem up to contain ~3000 variables, the solver was able to solve it in less than 5 minutes. This is, by the way, a problem that neither the newest MiniSat, nor Lingeling, nor Glucose can solve. Interestingly, the Extended Resolution version of Glucose can solve it, in almost the same time as CryptoMiniSat with ER added…

The Chaos Computer Club (CCC) is having its yearly conference starting today. It’s a madhouse here, which is great for ideas, so I have been having this rush of ideas implemented. First off, it seems that complex problems with few variables are way too difficult to solve even with distributed solving. I have been trying to solve some difficult problems, but no matter how much computer power I throw at them (and I have been throwing >2 years’ worth, with >300 CPU cores running), not much progress is being made. I have lately been attributing this “failure” to one prime problem: lack of expressiveness.

Basically, I feel like the SAT solver is trying to express some complex functions with learnt clauses. For instance, an adder. However, the only thing it can do, is describe this function using a Karnaugh Map. Anybody who has tried to express an adder without introducing new variables is well aware that that’s a difficult task. So, we need new variables. This is the point where some collaboration I have lately been doing with some researchers (Laurent Simon and George Katsirelos) comes into play. Following their footsteps, I realised we could introduce new definitions, using a new targeting method we developed. So, I did just that, and here is a nice result for a very difficult problem:

```num: 7750 lit1: 741 lit2: 4641 num: 2339 lit1: 1396 lit2: 1670 num: 2172 lit1: -719 lit2: 741 num: 2169 lit1: -719 lit2: 4641 num: 2061 lit1: 1669 lit2: 1670 num: 2010 lit1: 1670 lit2: 1734 num: 1973 lit1: 1670 lit2: 1882 ... time: 2.53 seconds```

Where literals `lit1` and `lit2` are both present in `num` number of clauses. So, for example literals `741` and `4641` are both present in 7750 clauses. Which means that if we introduce a definition with a new variable `10000`:

`10000 = 741 OR 4641`

(which requires 2 binary clauses and one 3-long clause) we can remove 7750 literals from the problem, while increasing the propagation potential. For example, if there was a clause

`741 OR 4641 OR 2 OR 3 = TRUE`

we can make it into

`10000 OR 2 OR 3 = TRUE`

What I have just described is a (dumb) version of extended resolution, the holy grail that many have tried and only few have succeeded at. It is a holy grail because it is provably more powerful than normal resolution, but heuristics of when and how to introduce the new variables have been difficult to come up with. What I have just described at least would reduce the problem size: removing e.g. ~7700 literals while only introducing 3 short clauses seems to worth the effort. It might not simulate extended resolution (ER), but it should speed up the solving, while giving a (lowly) shot at ER.

PS: If all of this sounds mad, that may be attributed to the ~500 people around me loudly developing and discussing issues ranging from peer-to-peer routing to FPGA programming

# Transitive OTF self-subsuming resolution

The title may be a bit long, but its essence is very simple: we try to shorten learnt clauses. The basic idea was described in this post: there is a clause we just derived, e.g.

`d V -e V f V g` (1)

where `d,e,f,g` are binary variables, `-` is binary negation, and `V` is the binary OR operator. We can remove a literal from this using self-subsuming resolution with e.g. the 2-long clause:

`f V -g` (2)

removing `g` from clause (1). This has been achieved before using on-the-fly self-subsuming resolution. The trick we add now is the following. Let’s assume that clause (2) was not in the clause database. With the above technique, `g` would not be removed. However, if clauses:

`f V a`
`-a V -g`

are inside the clause database, we could, in fact, remove `g`, since the above two clauses, when we resolve them on `a` become:

`f V -g`

i.e. clause (2), what we have been searching for! So, how could we do this kind of reasoning efficiently? It turns out that this is not so difficult. We simply need to try to propagate `-f` using only the 2-long clauses. Then, we will reach `-g` through the intermediary, `a`.

Naturally, we can do the above recursive-propagation process not only for `f` but for all literals in the original clause (1), and then try to perform on-the-fly self-subsuming resolution, as before. There is only one catch: doing this kind of recursive propagation on all 2-long clauses for all literals in a clause is too time-consuming. So we only do it for clauses that are short: 5 literals or shorter. The results are in, and seem to indicate that transitive on-the-fly self-subsuming resolution with a limit of 5-long clauses is indeed viable:

The set of problems used were those of the SAT Competition 2009, and the time limit was 1500 seconds on some powerful machines — they are approx 2x as fast as those used in the competition. As you can see, transitive OTF self-subsuming resolution seems to pay off in terms of number of problem instances solved within a certain time limit. I have decided to add this feature to the upcoming CryptoMiniSat 2.6.1, which should be ready soon.

# Self-subsuming resolution

Self-subsuming resolution in SAT uses the resolution operator to carry out its magic. Resolution can be used on two clauses if they share a variable, but the sign of the variable is inverted, e.g.
`v1 V v2`
`-v1 V v3 V v4`
(where `v1..v4` are binary variables and “`V`” means binary OR) can be resolved on `v1`, producing the clause:
`v2 V v3 V v4`

In SAT this is used to simplify problems as follows. Let’s assume we have many clauses, among which there are these three:
`a V c (1)`
`a V -c V d V f (2)`
`a V -c V g (3)`
In this case, if we use the resolution operator on (1) and (2), we get:
`a V d V f (4)`
The interesting thing about (4) is that it is a strict subset of (2). In other words, we could simply replace (2) with (4), thus shortening clause (2). If we use the resolution operator on (1) and (3) we get:
`a V g (5)`
whose literals form a strict subset of those of (3), so we can replace (3) with (5), again shortening a clause. We shortened 2 clauses, each with one literal. For completeness, this technique can be applied in a recursive manner on all possible clause-pairs.

Until now, CryptoMiniSat was doing self-subsuming resolution in such a way that the clauses being manipulated were kept inside the propagation queue. The problem was, that the propagation queue enforces a very strict position of literals in the clause. So, when e.g. removing literal “`a`” from clause (3), CryptoMiniSat had to completely remove the clause, and then to completely re-add it, since the position of “`a`” changed (it got removed). The additional overhead for this was simply too much: in certain cases, self-subsuming resolution took 130 seconds to complete, 115 of which was taken by this detach-reattach overhead.

The solution to this problem was to completely remove all clauses from the propagation queue, then do self-subsuming resolution, and finally re-add the clauses. Interestingly, complete removal of all clauses is very fast (essentially, a constant-time operation, even though removing clauses one-by-one is very costly), and completely re-adding them is also very fast (linear in the number of clauses). The first impressions from this technique are very positive, and I decided to release CryptoMiniSat 2.6.0 with this technique included.

NOTE: Thanks to N. SĂ¶rensson for pointing me out that self-subsuming resolution could be done better. I am not sure this is what he meant, but fingers are crossed.