# Memory layout of clauses in MiniSat

I have been trying to debug why some MiniSat-based solvers perform better at unit propagation than CryptoMiniSat. It took me exactly 3 full days to find out and I’d like to share this tidbit of information with everyone, as I think it might be of interest and I hardly believe many understand it.

The mystery of the faster unit propagation was that I tried my best to make my solver behave exactly as MiniSat to debug the issue, but even though it was behaving almost identically, it was still slower. It made no sense and I had to dig deeper. You have to understand that both solvers use their own memory managers. In fact, CryptoMiniSat had a memory manager before MiniSat. Memory managers are used so that the clauses are put close to one another so there is a chance that they are in the same memory page, or even better, close enough for them not to waste memory in the cache. This means that a contiguous memory space is reserved where the clauses are placed.

# Faster cleaning of the learnt clause database

In SAT solvers, removing unneeded learnt clauses from the clause database sounds like a trivial task: we somehow determine which clauses are not needed and we call remove() on them. However, in case performance is an issue, it can get a bit more more complicated.

### The problem

The issue at hand is that clauses are stored in two places: as a list of pointers to the clauses and in a list of lists called the watchlist. Removing clauses from either list can be an $O(n^2)$ operation if we e.g. remove every element from the list, one by one. In fact, an old version of the most popular SAT solver, MiniSat2, used to do exactly this:

Here, removeClause() is called on each clause individually, where removeClause() eventually calls remove() twice, where remove() is a linear operation:

It is clear that if the number of learnt clauses removed is a significant percent of all clauses (which it is after some runtime), this is an $O(n^2)$ operation.

### My original solution

My original solution to this problem was the following. First, I did a sweep on the watchlist and detached all learnt clauses. This is an $O(n)$ operation. Then, I ran the algorithm above, without the removeClause(). Finally, I attached the remaining learnt clauses: again an $O(n)$ operation. This solution is significantly faster than the MiniSat one as its worst-case runtime is only $O(n)$. The improvement is measurable — worst-case cleaning times dropped from seconds to tenths of seconds. However, it can be further improved.

### The improved solution

The improvement that came to my mind just yesterday was the following. I can keep a one bit marker in each learnt clause that indicates whether the clause needs to be detached or not. Then, I can run the algorithm as above but replace removeClause() with markclause() and run through the watchlists once to remove (and free) the marked clauses. This works really well and it only necessitates one sweep of the watchlists, without any useless detach+reattach cycles.

The newer GitHub version of MiniSat also marks the clauses instead of detaching them immediately and then removes them in one sweep, later. Interestingly, it keeps a list of ‘dirty’ occurrence lists and only goes through the ones that need removal. I find that a bit strange for this specific purpose: usually almost all watchlists are affected. In other cases, though, keeping dirty lists in mind can be a good idea, e.g. if only few clauses are removed for some optimization step.

# Speeding up MiniSat with a one-liner

All SAT solvers must have a tri-state class that can hold the values `true`, `false`, and `undefined`. Although it’s not hard to write code to represent such a class, it’s hard to write such it so that all typical operations are fast. In this blog post I will compare three different ways of doing it.

### The original MiniSat 2.0 code

The original MiniSat 2.0 used the following code for such a class:

All the operators are pretty straightforward and fast, except `operator^` which flips `true` to `false` and `false` to `true` in case `b` is `true`. As you might have noticed it’s using a branch instruction to achieve this result which is weird and slow. Branching is known to be a headache as a wrong branch prediction can slow down the CPU quite a bit. Since `b` is supposed to be random (though it’s not that random), this might slow down the flip operation. Let’s call this version of the class solution (1).

### Newest MiniSat code

Current MiniSat has changed the code to the following:

It doesn’t use a branch operator to flip values: `operator^` uses a simple XOR. However, other operators, notably the comparison operator became much more involved, and thus expensive. Let’s call this solution to the problem solution (2).

### Numerical trickery on the old code

Instead of using really tricky comparison and binary AND/OR operators, let’s try to use the original class, but replace the branch instruction with a bit of numerical trickery:

This makes ‘extensive’ use of the ALU of the chip for the flip operator to avoid using the branch instruction. However, it keeps other operators cheap as per the first version. Let’s call this solution to the problem solution (3).

### Comparison

I have tried to solve two different problems from the SAT Competition of 2009 with all three solutions. First is a typical electronic engineering problem, 9dlx_vliw_at_b_iq2.cnf. The second is a typical cryptographic problem, mizh-md5-47-3.cnf. Here are the timing results for my i7-3612QM (Ivy Bridge):

 Problem Solution(1) Solution(2) Solution(3) 9dlx_vliw_at_b_iq2.cnf 135.9s 138.8s 132.5s mizh-md5-47-3.cnf 60.7s 60.5s 56.3s

So, in this very small test, solution(3) wins.

And now for a bit of history. I have tried the trick in solution(3) about 3 years ago. Back then it seemed slower to perform the numerical trick than to use the branch. It turns out that on modern CPUs either the ALU is faster, branch misprediction is more costly, or both (or, the compilers have evolved to compile my numerical trickery into some weird thing). Anyway, it’s kind of a cool speedup for a one-liner.

# MiniSat in your browser

Lately, I have become an LLVM nut. LLVM is amazing, it can do a lot, and, weirdly enough, a system developed from it is probably one of the biggest users of SAT solvers. I’m talking about KLEE, the symbolic virtual execution machine. It takes a source, compiles it with LLVM, then translates the LLVM IR into SMT, runs STP on it, which in turn runs CryptoMiniSat, and voila… the user gets back a free bug report. Or 1’200 bug reports.

### Emscripten

Another great use of LLVM is emscripten, which takes the LLVM IR (again) and turns it into javascript. This of course means we can now run SAT solvers in the browser. This is so much fun. Here you can play with it, this is MiniSat “core”, from the golden old days of 2009. Just type in any CNF in DIMACS format and enjoy :) You can now solve the most basic NP-complete problem, right from your browser!

### Epiloge

Yes, it’s actually running in your browser. The default CNF describes the HiTag2 cipher, with 10 output bits and 2 help bits given. The instance was generated using the Grain of Salt system. Note that the web page hangs until MiniSat finishes. Try typing in your own CNFs, play as much as you like. If you are patient, it can solve many problems, including reversing modern ciphers, finding SHA-1 collisions (PDF), verifying hardware and software… you just have to be patient enough.

Probably unbeknownst to you, you are using products of SAT solvers for your daily life: CPUs are verified using SAT solver-based techniques, airplane software is formally verified using SAT solvers, FPGA and CPU layouts are optimized using them, and if you are lucky, your car’s safety-critical systems are also verified using formal techniques, which basically means SAT solvers. Actually, even train schedules and public transport schedules are created using SAT solvers — I know for a fact that many trains in Europe are scheduled using these techniques. All these can be done using the very simple problem description, the CNF, you see above — and many of these systems use MiniSat.

### How I did it and what could be improved

It’s relatively easy to compile any C or C++-based SAT solver to javascript using emscripten. Here is my github repo, with included HOWTO, for MiniSat. I would love to compile current lingeling, but its license doesn’t seem to allow emscripten to even think about compiling it. I’ll compile CryptoMiniSat later, it’s a bit hairy right now, I’m trying to refactor the code since it’s a mess. Until then, I hope you’ll enjoy playing with MiniSat!

### Acknowledgements

A big thanks to my colleagues who pushed me to became an LLVM nut. Also, big thanks to Niklas Eén and Niklas Sörensson for MinSat!