Tag Archives: failed literal probing

Testing CryptoMiniSat using GoogleTest

Lately, I have been working quite hard on writing module tests for CryptoMinisat using GoogleTest. I’d like to share what I’ve learnt and what surprised me most about this exercise.

An example

First of all, let me show how a typical test looks like:

Here we are checking that intree probing finds that the set of three binary clauses cause a failure and it enqueues “-2” at top level. If one looks at it, it’s a fairly trivial test. It turns out that most are in fact, fairly trivial if the system is set up well. This test’s setup is the following test fixture:

Continue reading

Failed literal probing and UIP

I have just realised that CryptoMiniSat, having won a number of medals, does one of the most basic things, failed literal probing, all wrong. Let me tell you why it’s all wrong. In essence, failed literal probing is trivial. We enqueue a fact, such as a, and then propagate it. If this fails, i.e. if two opposing facts such as g and -g are derived, we know that a cannot be true, so we set -a. Easy. Or maybe not so easy.

The devil is in the details, so let’s see how we derived both g and -g from a. Let’s assume that we have the following set of binary clauses:
-a V b
-b V c
-b V d
-d V e
-d V f
-e V g
-f V -g

which, from the point of view of a is best described as the graph:

Propagating "a", deriving both "g" and "-g"

The problem is, that if we only derive -a from this graph, we end up with only that one literal set, because -a doesn’t propagate anything on the clauses. This is quite sad, because, in fact, we could derive something stronger. From the graph it is evident that setting d would have failed: the graph would simply have its upper part cut away completely, but the lower part, including the derivation of both g and -g would still stand:

Deriving both "g" and "-g" from "d"

What is special about node d? Well, it’s where the 1st UIP, the first unique implication point, lies. And it is quite simple to see that -d is in fact the strongest thing we can derive from this graph. It’s much stronger than simply -a, because setting -d propagates on the clauses, giving -b,-a, setting three variables instead of one, including -a. An interesting observation is the following: deriving -b is the 2nd UIP, and deriving -a is the last UIP. In other words, at least in this most simple case, 1st UIP is in fact the strongest, and 2nd, 3rd.. last UIP are less strong in strict order.

Let me remark on one more thing about failed literal probing. Once failed literal probing has been done on literal x and it visited the node set N, there is no need to try to do failed literal probing on any nodes in N, since they cannot possibly fail. Although the failing of a literal can have consequences on the failing of other literals, if we ignore this side-effect, we could speed up failed literal probing by marking literals already visited, and only carrying out failed literal probing on ones that haven’t been marked. This is really trivial, but I haven’t been using it :S

I am quite sure that some advanced SAT solvers (such as lingeling) do all of the above things right. It’s probably only CryptoMiniSat that fails miserably :)

Note: there is a subtle bug with marking literals visited as “OK”. If two different subgraphs could fail, but we abort on the first one, then we might mark a literal OK when in fact it isn’t. It is therefore easiest not to mark any literals if the probe failed (which is very rare).

CCC madness

The Chaos Computer Club (CCC) is having its yearly conference starting today. It’s a madhouse here, which is great for ideas, so I have been having this rush of ideas implemented. First off, it seems that complex problems with few variables are way too difficult to solve even with distributed solving. I have been trying to solve some difficult problems, but no matter how much computer power I throw at them (and I have been throwing >2 years’ worth, with >300 CPU cores running), not much progress is being made. I have lately been attributing this “failure” to one prime problem: lack of expressiveness.

Basically, I feel like the SAT solver is trying to express some complex functions with learnt clauses. For instance, an adder. However, the only thing it can do, is describe this function using a Karnaugh Map. Anybody who has tried to express an adder without introducing new variables is well aware that that’s a difficult task. So, we need new variables. This is the point where some collaboration I have lately been doing with some researchers (Laurent Simon and George Katsirelos) comes into play. Following their footsteps, I realised we could introduce new definitions, using a new targeting method we developed. So, I did just that, and here is a nice result for a very difficult problem:

num: 7750 lit1: 741 lit2: 4641
num: 2339 lit1: 1396 lit2: 1670
num: 2172 lit1: -719 lit2: 741
num: 2169 lit1: -719 lit2: 4641
num: 2061 lit1: 1669 lit2: 1670
num: 2010 lit1: 1670 lit2: 1734
num: 1973 lit1: 1670 lit2: 1882
...
time: 2.53 seconds

Where literals lit1 and lit2 are both present in num number of clauses. So, for example literals 741 and 4641 are both present in 7750 clauses. Which means that if we introduce a definition with a new variable 10000:

10000 = 741 OR 4641

(which requires 2 binary clauses and one 3-long clause) we can remove 7750 literals from the problem, while increasing the propagation potential. For example, if there was a clause

741 OR 4641 OR 2 OR 3 = TRUE

we can make it into

10000 OR 2 OR 3 = TRUE

What I have just described is a (dumb) version of extended resolution, the holy grail that many have tried and only few have succeeded at. It is a holy grail because it is provably more powerful than normal resolution, but heuristics of when and how to introduce the new variables have been difficult to come up with. What I have just described at least would reduce the problem size: removing e.g. ~7700 literals while only introducing 3 short clauses seems to worth the effort. It might not simulate extended resolution (ER), but it should speed up the solving, while giving a (lowly) shot at ER.

PS: If all of this sounds mad, that may be attributed to the ~500 people around me loudly developing and discussing issues ranging from peer-to-peer routing to FPGA programming

On failed literal probing

Apparently, there have been quite some work done on failed literal probing, although I don’t think they have all been put into a common package yet.

The idea is quite simple in its purest form: try to set variable v1 to true and see if that fails. If it does, variable v1 must be false. If it doesn’t fail, try the other way around: set it to false and if that fails, it must be true.

There are a couple of tricks we can add, however. For example, if both v1 and !v1 set variable v10 to value X (where X can be either true or false), then set v10 to X. The reasoning here is simple: v1 must be set, so whatever both v1 and !v1 imply, it must be set, too. So we can safely set v10 to X.

A more interesting thinking is the following. If v1 sets v20 to true, but !v1 sets v20 to false, then v1 = v20. So, we can replace v20 with v1. One less variable to worry about!

There are even more tricks, however. If setting v1 to true and false both shorten a longer XOR to a 2-long XOR “v40 + v50 = false“, this 2-long XOR can be learnt: v40 can be replaced with v50.

And more tricks. If there is a 2-long clause v1 or v2 = true, then we can do all of the above, but with v1 and v2 this time. Since either v1 or v2 must be true, all the above ideas still work. In all previous ideas all we used was the fact that either v1 or !v1 must be true. This is still the case: either v1 or v2 must be true. This, by the way, is called 1-recursive learning. (note: the fourth paragraphs changes a bit, but we still learn a 2-long xor).

And now something new. I have been thinking about trying to re-use old learnt clauses. They really code down sometimes quite important facts about the problem. What if we tried to re-use them? After all, keeping them in memory is really cheap: the only problem with them is that doing regular propagation with them takes a lot of time, thus slowing down the solver. But what if we only used them to do failed literal probing? I just launched a test on the Grid’5000 cluster to find out. Fingers crossed…