Why programs fail

I just bought the book with the title of the blogpost by Andreas Zeller. Essentially, it’s about debugging, where the author analyses the chain of program defect leading to infected program state, finally leading to program failure. I bought the book because while working with CryptoMiniSat, I have encountered so many bugs that they could fill a book.

The problem with chasing bugs in CryptoMiniSat is that SAT solvers employ a very optimised algorithm that makes a lot of decisions per second, so recording all interesting events makes for a very big dump file. I have per-module verbosity settings, so if I can narrow down what module(s) are causing the failure, there is less debug output to go through, but sometimes I still have to wade through 10-20GB dumps to see how the program state got infected.

While going through such large dumps, the following occured to me: what if we loaded all this data into a MySQL database? Some 8 years ago I had a job where I processed a large chunk of the data on all of Hungary’s phone conversations on a daily basis. Essentially, data on your phone calls are recorded in many different databases, and these must be merged to calculate your bill and to provide statistical feedback to the company. I wrote a program that processed 3million records in about 1 hour using MySQL and a number of SQL statements that spanned approx. 2-3000 lines. In other words, I am not afraid of large databases, and I think such multi-gigabyte data could easily be loaded into a MySQL database.

The goal of loading a SAT run into a MySQL database is the following: once the MySQL data is ready, a bit of PHP, graphviz and gnuplot will plot any relevant information at any point in time about the solving. In other words, we could dig this data visually, with some very interesting effects.

For example, I have been lately having problems with clause strengthening interfering with variable elimination. The idea is in theory quite simple. Variable elimination eliminates a variable and adds clauses that represent the original clauses the variable was present in. These clauses are sometimes trivial, such as:
a or b or c or !c = true
which is never added since it is satisfied whether c is true or false. So far, so good. Clause strenghtening is a method whereby clauses are shortened. For example, if there was a clause:
e or b or c = true
the solver might discover that in fact it is also true that:
e or b = true
and replace the original clause with this, since this clause is “stronger”: it poses a more stringent requirement. The way variable elimination and clause strenghtening interact is as follows. Let’s suppose we eliminate on varible e. If the above clause is strenghtened before elimination, then instead of:
a or b or c or !c = true
we might have got:
a or b or !c = true
which must be added. We actually have more clauses because of clause strenghtening! I believe nobody ever found this anomaly. To properly quantitise how this affects solving, we need to know all states of a clause, and a web interface with a MySQL backbone could help a lot with that.