# Being dumb is part of the deal

In my last post, I presented a fast and easy way to add binary clauses to the clause database of SAT solvers, and I conjectured that since the algorithm was so simple, there must be something wrong with it. I was, in fact, right. I ran a cluster experiment, and it turned out that adding all those binary clauses was slowing down the solving.

My first reaction was of course: how on Earth? Binary clauses are “good”: both MiniSat and Glucose try to amass these clauses as much as possible. So what can possibly be wrong? Well, redundant information, for a start. Let’s assume that we know that

`A leads to B`

If I now tell you that

`A leads to C`

it wouldn’t really help you. However, the algorithm I presented last time didn’t address this issue: it simply added a good number of binary clauses (though, luckily, not all possible ones) that were totally redundant.

The trick with adding binary clauses, then, is to add as few as possible while achieving maximal graph connectivity. And, while we are at it, we can of course remove redundant information like the clause I presented above. Neither of these two is easy. I use a Monte Carlo method to approximate the degree of the vertices for the first algorithm, and then sort the vertices by degree when adding binary clauses. The second algorithm just needs a bit of imagination and a willingness to draw graphs on a piece of paper.
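To make the redundancy above concrete: a binary clause `A leads to C` adds nothing if `C` is already reachable from `A` through other edges of the implication graph. Here is a minimal sketch of that reachability check (hypothetical toy code, not CryptoMiniSat's actual implementation):

```python
from collections import defaultdict

def is_redundant(edges, a, c):
    """Return True if c is reachable from a in the implication graph
    without using the direct edge a -> c (plain depth-first search)."""
    graph = defaultdict(list)
    for u, v in edges:
        if (u, v) != (a, c):  # drop the edge under test
            graph[u].append(v)
    stack, seen = [a], {a}
    while stack:
        u = stack.pop()
        if u == c:
            return True
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False

# "A -> B" and "B -> C" already imply "A -> C", so the direct edge is redundant:
edges = [("A", "B"), ("B", "C"), ("A", "C")]
print(is_redundant(edges, "A", "C"))  # True
```

A real solver would of course work on literals and watchlists rather than string labels, but the principle is the same.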

So, there is a point in being dumb: without hands-on knowledge of how binary clauses can negatively affect problem solving, I would never have thought of developing such algorithms. The final algorithms take ~3-4 seconds to complete, add (fairly close to) the minimum number of clauses, remove all redundant clauses, and work like a charm.

There is only one thing to understand now: why does Armin Biere add hundreds of thousands of binary clauses? I seem to be able to achieve the same effect in a couple of thousand… Again, I must be dumb, and he must be doing something right. I am lost, again :(

EDITED TO ADD: You might enjoy this plot of PrecoSat236 vs. CryptoMiniSat 2.5.0beta1 on the SAT Competition instances of 2009.

# I must be dumb

Apparently, hyper-binary resolution is not easy. However, I have been reading the paper by Gershman and Strichman, and they make it sound like it’s the easiest thing of all — especially if you don’t do it the way they are doing it, of course.

Hyper-binary resolution is essentially very easy. Let’s say you have three clauses:

```-a V b
-a V c
d V -b V -c
```

(where `V` is the binary OR, `-` is negation, and each line must be satisfied)

If we set `a=true` and propagate, we get the result that `d=true`. This can be written down as the clause:

`-a V d`

More interestingly, this new clause actually helps: if you only use the original 3 clauses, and you set `d = false` and propagate, nothing happens. However, if you add the new clause to the clause list, then `a=false` propagates automatically.
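The propagation above can be sketched with a toy unit-propagation routine (hypothetical code, nothing like a real solver's optimised inner loop). Literals are integers, `-x` means the negation of `x`:

```python
def propagate(clauses, assignment):
    """Repeatedly apply the unit rule until a fixed point is reached."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in assignment for l in clause):
                continue  # clause already satisfied
            unassigned = [l for l in clause
                          if l not in assignment and -l not in assignment]
            if len(unassigned) == 1:  # unit clause: its sole literal must hold
                assignment.add(unassigned[0])
                changed = True
    return assignment

# -a V b,  -a V c,  d V -b V -c   (a=1, b=2, c=3, d=4)
clauses = [[-1, 2], [-1, 3], [4, -2, -3]]
print(propagate(clauses, {1}))  # {1, 2, 3, 4}: d is forced, so -a V d holds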

Such new binary clauses (like `-a V d`) have previously been found as follows:

1. Build a graph from the binary clauses: literals are vertices, and there is an edge from vertex `a` to vertex `b` whenever there is a binary clause `-a V b`
2. Set `a=true` and propagate
3. Check whether every propagated fact can be reached from the vertex corresponding to `a` in the graph. For each fact that cannot be reached, add the binary clause that represents the missing edge.
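The graph part of these steps can be sketched as follows (an assumed toy implementation, not any solver's real code). A binary clause `x V y` contributes two implications: `-x -> y` and `-y -> x`:

```python
from collections import defaultdict

def implication_graph(binary_clauses):
    """Build the implication graph from binary clauses over int literals."""
    graph = defaultdict(set)
    for l1, l2 in binary_clauses:
        graph[-l1].add(l2)  # l1 V l2: if l1 is false, l2 must hold
        graph[-l2].add(l1)  # and symmetrically
    return graph

def reachable(graph, a):
    """All literals reachable from a (step 3's reachability check)."""
    seen, stack = set(), [a]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(graph[u])
    return seen

# Binary clauses -1 V 2 and -2 V 3 give the implication chain 1 -> 2 -> 3:
g = implication_graph([[-1, 2], [-2, 3]])
print(sorted(reachable(g, 1)))  # [1, 2, 3]
```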

One can say that this algorithm works perfectly. But I am a lazy coder, and I don’t want to implement a graph library inside CryptoMiniSat. What’s there to do then? Well, we can achieve the same thing with data structures already present in all modern solvers!

The trick is that binary clauses are propagated in a very different manner than normal clauses: they have their own watchlists, and propagating them is extremely fast. So, instead of fiddling with some over-optimised graph library, we can simply write a routine, called propagateBin(), which only propagates the binary clauses already inside the solver. We then compare the results of propagate() and propagateBin(), and if something is missing, we add it.

In fact, implementing this algorithm using the above trick is trivial: no graph libraries, nothing much to do really, just a 20-line propagateBin(), which itself is just a copy-paste of the first couple of lines of propagate(). A very short side note: it’s also very fast. We can find 100’000 binary clauses in less than 0.1 seconds. Here is the commit that implements this in CryptoMiniSat. I think it makes sense even for those unfamiliar with the code.
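The comparison trick can be sketched like this (a hypothetical toy version: the real propagateBin() works on dedicated watchlists, while here we just filter the clause list down to the binary clauses):

```python
def unit_propagate(clauses, assumptions):
    """Toy unit propagation over int literals; -x means NOT x."""
    assignment = set(assumptions)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in assignment for l in clause):
                continue  # clause already satisfied
            unassigned = [l for l in clause
                          if l not in assignment and -l not in assignment]
            if len(unassigned) == 1:  # unit clause forces its literal
                assignment.add(unassigned[0])
                changed = True
    return assignment

def hyper_binary_candidates(clauses, a):
    """Facts reached by full propagation but missed when propagating
    binary clauses only: each such literal x yields a new clause -a V x."""
    binaries = [c for c in clauses if len(c) == 2]
    full = unit_propagate(clauses, {a})
    bin_only = unit_propagate(binaries, {a})
    return full - bin_only

# The three clauses from the example: -a V b, -a V c, d V -b V -c
clauses = [[-1, 2], [-1, 3], [4, -2, -3]]
print(hyper_binary_candidates(clauses, 1))  # {4}: add the clause -1 V 4
```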

Finally, why am I dumb? Because there must be something amiss here. I can’t believe that people much more knowledgeable in the subject didn’t come up with this idea before. So, they must have, and probably already published it; or the idea must be really bad, missing elements, or incomplete. If it is complete, however, then hyper-binary resolution is, once and for all, fast and efficient. I can’t wait for someone to prove me wrong.

# Blocked clause elimination implies dependent variable removal

While writing the description of CryptoMiniSat for the 2010 SAT Race, I have realised how powerful the blocked clause elimination (BCE) technique by Matti Jarvisalo and Armin Biere is. For xor clauses, which represent a simple XOR like

`v1 + v2 + v3 = true`

there is a technique called dependent variable removal by Heule and Maaren. This technique removes an xor clause if one of its variables appears only in one place: in that xor clause. The idea is that since that variable only appears there, the XOR can always be satisfied. For example, if variable `v1` is dependent (i.e. it only appears in this XOR), and

`v2 = true, v3 = true`

then this xor-clause can simply be satisfied by giving `v1 = true`:

`true + true + true = true`

So, variable `v1` can be removed from the problem along with the xor clause, and the value of variable `v1` needs only to be calculated after the solving has finished.
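That after-the-fact calculation is a one-liner: since `v1 + v2 + v3 = true` and `v1` appears nowhere else, `v1` is fully determined by the other two values once they are fixed. A small sketch (hypothetical helper name):

```python
def recover_dependent(v2, v3, rhs=True):
    """v1 XOR v2 XOR v3 == rhs  implies  v1 = rhs XOR v2 XOR v3."""
    return rhs ^ v2 ^ v3

# The example from the text: v2 = true, v3 = true forces v1 = true,
# since true + true + true = true.
print(recover_dependent(True, True))  # True
```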

The great thing about blocked clause elimination is that it can achieve this automatically, without the use of xor-clauses! Let us convert our xor-clause into regular clauses:

```v1 V v2 V v3     (1)
v1 V -v2 V -v3   (2)
-v1 V v2 V -v3   (3)
-v1 V -v2 V v3   (4)
```

Now, let us try to block these clauses on variable `v1`. Resolving (1) with (3) gives a tautology, since the result of the resolution,

`v2 V v3 V v2 V -v3`

has both a literal and its negation (`v3` and `-v3`) in it. The same is true for resolving (1) with (4), this time on `v2` and `-v2`. The same happens for (2)&(3) and (2)&(4), too, eventually removing all four clauses.
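This blockedness check is easy to verify mechanically. A small sketch (hypothetical helper names, literals as integers with `v1=1`, `v2=2`, `v3=3`):

```python
def resolve(c1, c2, var):
    """Resolvent of c1 (containing var) and c2 (containing -var) on var."""
    return [l for l in c1 if l != var] + [l for l in c2 if l != -var]

def is_tautology(clause):
    """A clause containing both a literal and its negation is a tautology."""
    return any(-l in clause for l in clause)

# Clauses (1)-(4) encoding v1 + v2 + v3 = true, split by the sign of v1:
pos = [[1, 2, 3], [1, -2, -3]]    # contain  v1: clauses (1) and (2)
neg = [[-1, 2, -3], [-1, -2, 3]]  # contain -v1: clauses (3) and (4)

# Every resolvent on v1 is a tautology, so all four clauses are blocked:
print(all(is_tautology(resolve(p, n, 1)) for p in pos for n in neg))  # True
```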

So, can blocked clause elimination replace dependent variable removal? I am not sure. Dependent variable removal can be used in conjunction with other methods that treat xor clauses, which can lead to potentially very long XOR chains. These long XOR chains are not well handled through regular clauses, so solvers that don’t handle xor clauses natively avoid creating them through heuristic cut-offs. Thus, dependent variable removal can potentially remove more of these XORs than blocked clause elimination. But blocked clause elimination can remove clauses that dependent variable removal cannot. So, I believe there is a point in using both.