Creating a SAT solver from scratch | Wonderings of a SAT geek

I have lately been having a small chat with Francois Grieu and it inspired me to think about how difficult it would be to create a new SAT solver from scratch. Let’s take it for granted that the person who embarks on such a difficult task understands the basics well, and wishes to create a (nowadays very popular) Conflict-Driven Clause-Learning (or simply, CDCL) type SAT solver. The main parts of these solvers are:

Clause parser&database to store the clauses
Propagation engine using watch-lists to carry out Boolean Constraint Propagation (BCP)
Conflict generation code, using the first-UIP scheme
Learnt clause activity calculator&deleter
Variable state database, storing activity and polarity
Search restart manager
Pre-processor is recommended — SatELite does the job reasonably well.

None of these are very difficult to do. However, by implementing these in a minimalistic fashion, one will only end up with a reasonably fast, but not a very fast solver. Enhancements, for example, storing binary clauses implicitly in the occurrence lists in the database bring small, but nevertheless important, benefits. Similarly, having specialised watch-lists to handle binary and tertiary clauses further speed up the solving. Conflict clause minimisation, on-the-fly subsumption, and on-the-fly self-subsuming resolution improve conflict generation. Polarity-calculation and -caching comes handy for variable polarities, and dynamic restarts are also useful. Learnt clause activities might need to support both MiniSat-style and Glucose-style activities.

The real challenge, however, comes with bullet number 7: a fast and complete pre- and in-processor is very complex to write. The pre-processing done by SatELite, i.e. self-subsuming resolution, variable elimination and subsumption are just the icing on the cake for a good pre- and in-processor. Doing these in the middle of the solving (i.e. in-processing), using learnt clauses, doing asymmetric branching, binary clause reasoning, variable replacement, blocked clause elimination, failed literal probing, tautology elimination, clause strenghtening with tautological binary clauses, clause cleaning, and a long list of others — that’s where the hard part lies.

With all of these challenges facing SAT solver implementers, is it reasonable to think that anyone can relatively easily participate in SAT Races and SAT Competitions? If one wishes to respond with yes to this question, then there are two paths ahead: either immense amounts of time need to be invested into writing such a solver, or an already written solver has to be taken as a base to build ideas upon. I think the only reasonable choices for a base are: lingeling, MiniSat, PrecoSat and (I hope to think) CryptoMiniSat. None of these solvers are perfect of course, so one has to put up with all the inherent problems in these solvers.

In the past months, I have been writing many blog entries trying to detail the inner workings of SAT solvers — mostly to help others (and in the process, myself) struggling with SAT solvers. Maybe I should be spending more time trying to document the code of CryptoMiniSat, so that the barrier for entry to creating winning SAT solvers could be lowered.