Tag Archives: subsumption

Implicit binary clauses

I have lately been trying to get CryptoMiniSat to use implicit binary clauses. The idea is that since binary clauses are very trivial (just two literals), and they don’t really need to keep state (like clause activity), they don’t really need to be stored at a separate location. Instead, they can be stored directly in the watchlists. There are a number of advantages and disadvantages that come with this approach.

The main advantage is a notable reduction of memory usage and memory fragmentation. The first is obvious: since we don’t allocate space for binary clauses separately, the memory usage of the program should go down. This is especially true since SAT problems usually contain a huge number of binary clauses. The secondary benefit, that of reduced memory fragmentation is not really that much of an advantage if someone uses, e.g. the boost pool library.

The disadvantages are mainly twofold. Firstly, bugs are very difficult to find. Since there is not one central database of binary clauses (as before), it becomes difficult to check the consistency of the watchlists. However, if inconsistencies creep in, then the solution found by the SAT solver could be wrong. Worst of all, consistency is difficult to keep, as binary clauses often need to be worked on by e.g. subsumption, variable elimination, etc. The second biggest disadvatage is that if a new algorithm comes up that needs a database of binary clauses, this database would need to be re-built every time to run that algorithm, which can be very costly in terms of time.

All in all, it took me about 1 day to implement implicit binary clauses, and about 3 days to debug it. Surprisingly, it came with some very positive side-effects. Firstly, the debugging session took out some very long-standing bugs in CryptoMiniSat. Secondly, since binary clauses represent most clauses in typical SAT problems, and since binary clauses cannot be subsumed by anything other than binary clauses, the subsumption algorithm has been notably speeded up and its memory usage has been reduced. The worst part of using implicit binary clauses has been the fact that I can no longer use binary clause-sorting to find binary xor clauses, and I must resort back to Tarjan’s strongly connected component finding algorithm. This algorithm is algorithmically faster (only O(V+E) versus O(n*logn) ), but practically slower, since I need to carry it out repeatedly, and I cannot save state. Furthermore, I haven’t yet coded it down, so I am using boost’s graph algorithm, which means that I now have a dependency in CryptoMiniSat. This will eventually be corrected, but it is already causing some trouble.

In the meantime, I have been parallelising CryptoMiniSat. Funnily enough, it took me about 2 days in total to multi-thread CryptoMiniSat versus the 4 days it took to implement implicit binary clauses. Oh well.

On-the-fly self-subsuming resolution

I have recently been trying a new method of shortening learnt clauses. There is a learnt clause minimisation paper by Sörensson and Biere, and I have recently been trying to do more. The trick I use, is that many literals can be removed from learnt clauses, if self-subsuming resolution (see my older post) is applied to them.

During self-subsuming resolution, under normal circumstances, the clauses used to remove literals with are the short clauses: binary and tertiary clauses. The trick I have discovered, is that since CryptoMiniSat keeps binary and tertiary clauses natively inside the watchlists (see my previous post), so doing self-subsuming resolution can be done extremely fast. We simply need to put the literals of the newly learnt clause into a fast-addressable memory, then go through the watchlists of every literal in the clause, and remove the literals that match.

The code used to achieve this is the following:

for (uint32_t i = 0; i < cl.size(); i++) seen[cl[i].toInt()] = 1;
for (Lit *l = cl.getData(), *end = cl.getDataEnd(); l != end; l++) {
    if (seen[l->toInt()] == 0) continue;

    Lit lit = *l;
    //watched is messed: lit is in watched[~lit]
    vec& ws = watches[(~lit).toInt()];
    for (Watched* i = ws.getData(), *end = ws.getDataEnd(); i != end; i++) {
        if (i->isBinary()) {
            if (seen[(~i->getOtherLit()).toInt()]) {
                seen[(~i->getOtherLit()).toInt()] = 0;
            }
        }
    }
}

Lit *i = cl.getData();
Lit *j= i;
for (Lit* end = cl.getDataEnd(); i != end; i++) {
    if (seen[i->toInt()]) *j++ = *i;
    seen[i->toInt()] = 0;
}
cl.shrink(i-j);

Essentially, we set the seen vector (which is initially all 0), to 1 where the original clause contained a literal. Then, we go through the watchlists, and check if any binary clause could strengthen the clause. If so, we unset the corresponding bit in seen. Finally, we clean the clause from the literals where seen is 0 (and then set all parts of seen back to 0).

The actual implementation only differs from the one above by also using tertiary clauses (which are also natively inside the watchlists) to do self-subsuming resolution. The results are very promising: CryptoMiniSat can now solve 2 more problems from the 2009 SAT Competition instances within the original time limit, and seems to scale better in the longer run, too. An example statistics output of CryptoMiniSat (from the UTI-20-10p1 problem):

c OTF cl watch-shrink   : 961191      (0.64      clauses/conflict)
c OTF cl watch-sh-lit   : 6096451     (6.34       lits/clause)

which means that in 64% of the cases, on-the-fly self-subsuming resolution was useful, and on average, it removed 6.3 literals from each clause where it was useful.

Self-subsuming resolution

Self-subsuming resolution in SAT uses the resolution operator to carry out its magic. Resolution can be used on two clauses if they share a variable, but the sign of the variable is inverted, e.g.
v1 V v2
-v1 V v3 V v4
(where v1..v4 are binary variables and “V” means binary OR) can be resolved on v1, producing the clause:
v2 V v3 V v4

In SAT this is used to simplify problems as follows. Let’s assume we have many clauses, among which there are these three:
a V c (1)
a V -c V d V f (2)
a V -c V g (3)
In this case, if we use the resolution operator on (1) and (2), we get:
a V d V f (4)
The interesting thing about (4) is that it is a strict subset of (2). In other words, we could simply replace (2) with (4), thus shortening clause (2). If we use the resolution operator on (1) and (3) we get:
a V g (5)
whose literals form a strict subset of those of (3), so we can replace (3) with (5), again shortening a clause. We shortened 2 clauses, each with one literal. For completeness, this technique can be applied in a recursive manner on all possible clause-pairs.

Until now, CryptoMiniSat was doing self-subsuming resolution in such a way that the clauses being manipulated were kept inside the propagation queue. The problem was, that the propagation queue enforces a very strict position of literals in the clause. So, when e.g. removing literal “a” from clause (3), CryptoMiniSat had to completely remove the clause, and then to completely re-add it, since the position of “a” changed (it got removed). The additional overhead for this was simply too much: in certain cases, self-subsuming resolution took 130 seconds to complete, 115 of which was taken by this detach-reattach overhead.

The solution to this problem was to completely remove all clauses from the propagation queue, then do self-subsuming resolution, and finally re-add the clauses. Interestingly, complete removal of all clauses is very fast (essentially, a constant-time operation, even though removing clauses one-by-one is very costly), and completely re-adding them is also very fast (linear in the number of clauses). The first impressions from this technique are very positive, and I decided to release CryptoMiniSat 2.6.0 with this technique included.

NOTE: Thanks to N. Sörensson for pointing me out that self-subsuming resolution could be done better. I am not sure this is what he meant, but fingers are crossed.