MiniSat in your browser

CLICK HERE to run CyrptoMiniSat in your browser

 

Lately, I have become an LLVM nut. LLVM is amazing, it can do a lot, and, weirdly enough, a system developed from it is probably one of the biggest users of SAT solvers. I’m talking about KLEE, the symbolic virtual execution machine. It takes a source, compiles it with LLVM, then translates the LLVM IR into SMT, runs STP on it, which in turn runs CryptoMiniSat, and voila… the user gets back a free bug report. Or 1’200 bug reports.

Emscripten

Another great use of LLVM is emscripten, which takes the LLVM IR (again) and turns it into javascript. This of course means we can now run SAT solvers in the browser. This is so much fun. Here you can play with it, this is MiniSat “core”, from the golden old days of 2009. Just type in any CNF in DIMACS format and enjoy :) You can now solve the most basic NP-complete problem, right from your browser!

MiniSat in javascript

Please click here to run CyrptoMiniSat in your browser — much better!!

Epiloge

Yes, it’s actually running in your browser. The default CNF describes the HiTag2 cipher, with 10 output bits and 2 help bits given. The instance was generated using the Grain of Salt system. Note that the web page hangs until MiniSat finishes. Try typing in your own CNFs, play as much as you like. If you are patient, it can solve many problems, including reversing modern ciphers, finding SHA-1 collisions (PDF), verifying hardware and software… you just have to be patient enough.

Probably unbeknownst to you, you are using products of SAT solvers for your daily life: CPUs are verified using SAT solver-based techniques, airplane software is formally verified using SAT solvers, FPGA and CPU layouts are optimized using them, and if you are lucky, your car’s safety-critical systems are also verified using formal techniques, which basically means SAT solvers. Actually, even train schedules and public transport schedules are created using SAT solvers — I know for a fact that many trains in Europe are scheduled using these techniques. All these can be done using the very simple problem description, the CNF, you see above — and many of these systems use MiniSat.

How I did it and what could be improved

It’s relatively easy to compile any C or C++-based SAT solver to javascript using emscripten. Here is my github repo, with included HOWTO, for MiniSat. I would love to compile current lingeling, but its license doesn’t seem to allow emscripten to even think about compiling it. I’ll compile CryptoMiniSat later, it’s a bit hairy right now, I’m trying to refactor the code since it’s a mess. Until then, I hope you’ll enjoy playing with MiniSat!

Acknowledgements

A big thanks to my colleagues who pushed me to became an LLVM nut. Also, big thanks to Niklas Eén and Niklas Sörensson for MinSat!

My SAT solver fuzzing setup

Since the SAT solver fuzzing paper by Brummayer, Lonsing and Biere, SAT solver fuzzing has been quite the thing in the SAT community. Many bugs have since been discovered in almost all SAT solvers, including those of this author. In this blog post I will detail my SAT fuzzing setup: what fuzzers I use and how I use them.

My fuzzers

The most obvious fuzzers to use are those present in the paper and accompanying website: CNFFuzz and FuzzSAT. These fuzzers randomly genereate high-quality structured problem instances. The emphasis here is on the word structured. Randomly generated instances, as the authors of the original paper note, do not excercise SAT solvers enough. SAT solvers are written with very specific problem types in mind: problems containing long chains of binary clauses, gates, etc.

Over the year(s) I have found, however, that these fuzzers are not enough. One of the revelations came when Vegard sent me lots of bugreports about CryptoMiniSat failing when Gaussian elimination was enabled. It turned out that although I ran the fuzzer for long hours, I only ran it with the default option, Gaussian turned off. So default options are sometimes not enough and when off-the track, systems can fail.

Vegard then sent me a fuzzer that wasn’t using CNFFuzz or FuzzSAT, but instead his own CNF generator for SHA-1 hash functions. Instances thus generated are not only structured, they are also very picky about the solution: they either have exactly one or zero. This helped fuzz solution reconstruction, which after blocked clause elimination may not be exactly easy. In fact, it’s so good at fuzzing that it finds a bug I have never been able to fix in CryptoMiniSat, to this day, about binary clauses. Whenever I block them, I cannot reconstruct the solution, ever, unless I disable equivalent literal replacement. The authors of the paper give a pretty complicated proof about why it should work in the presence of equivalent literal replacement, which I still haven’t understood.

When I implemented disconnected component analysis into CryptoMiniSat 3, I knew I had to be careful. In the SAT Competition’11 I failed terribly with it: CryptoMiniSat 2 memory-outed and crashed on many instances with disconnected components. The solution I came up with was to make a program that could concatenate two or more problems into one problem, renumbering the variables. This meta-fuzzer (that used fuzzers as inputs) was then used to fuzz the disconnected component handling system. Although this sounds sufficient, it is in fact far from it. Firstly, the problems generated are disconnected at toplevel — whereas problems in the real world sometimes disconnect later during the search. Secondly, it took 2, maybe 4 CNF inputs, but real problems sometimes have thousands of disconnected components. Both of these issues had to be addressed.

One of the most difficult parts of modern SAT solvers is time-outs of complicated algorithms. Time-outs sometimes make code quite convoluted, and thus are a source of bugs. However, most fuzzers generate problems that are too small for any time-outs to occur, thus severely limiting the scope of the fuzzer. So, the fuzzers don’t exercise a part of the system that is fully exercised by real-world problems. To fix this, I have written a short fuzzer script that generates enormous, but simple problems with millions of variables. This helped me fix a lot of issues in timeouts. I advise having a RAM-disk when using it though, as it writes gigabyte-size CNFs.

I also use some other fuzzers, namely one that excercise XOR-manipulation and the sgen4 CNF generator by Ivor Spence (PDF, pages 85-86) that generates problems of tuneable hardness.

My fuzzing system

Fuzz CNF generators are nice, but not enough on their own. A system around them is necessary for the full benefit. Writing one sounds trivial at first, but my system around the fuzzers is about 1000 lines of python code, so maybe it’s not that trivial after all. It has to do a number of things to be truly useful.

First, it needs to test the solutions given by the solver. If the solution is SAT, it needs to read the (gzipped) CNF file, and check the (partial) solution. Second, if the solution is UNSAT, it either needs to call the DRUP checker to verify the proof, and/or it needs to call another solver to see if it finds a solution or not — and verify that the other solver’s solution is correct. I actually run both DRUP and the other solver, you can never be sure enough. Note that solution verification also complicates time-limiting: we don’t want to wait 30 minutes because a fuzz CNF is too complicated, and we don’t want to indicate an error just because the problem is hard.

Secondly, fuzzing is not only about generating CNFs that trigger bugs, but also about saving these problematic cases for later, as a form of regression tests. These regression tests need to be ran regularly, so that bugs that came up before don’t come up again. I have a large set of regression tests that I have accumulated over the years. These help me increase confidence that when I improve the solver, I don’t accidentally re-introduce bugs. My fuzzer system helps manage and run these regression tests.

Finally, we don’t just want to fuzz the solver through the CNF interface. SAT solvers typically expose a library interface where one can add clauses, solve (with assumptions) and check the conflict returned in terms of the assumptions given. For now, CryptoMiniSat’s library is fuzzed through a specialized CNF format, where “c Solver::solve()” lines are interspersed with the regular clauses. These lines indicate for the solver to call the solve() method and output the solution to a file. This file is then read by the fuzzing system and verified by copying parts of the file into another CNF and using the DRUP-checker and another SAT solver to verify the results. This simulates library usage without assumptions. Thanks to Lukas Prokop who alerted me to bugs in assumptions usage, assumptions-based solving is also simulated through “c Solver::solve( LIT LIT LIT…)” calls, where LITs are replaced with a random literals.

Future work

I want to do two things related to testing and fuzzing. First, I want to improve the checker so that it checks for more problems. I want to check the conflict returned when assumptions are used. Further, I want to check timeouts — this will require some coordination between the fuzzer system and the solver where times are printed that are later checked for correctness. Some optimisations must never take more than N seconds on a fast enough machine, and this could be checked relatively easily.

Second, I want to move CryptoMiniSat into a test-driven system. This will require some quite extensive code re-writing so that test-harnesses will be easier to write. However, it will help fix one issue that hasn’t been talked about, kind of like the elephant in the room: performance bugs. They are quite prevalent, and cannot be caught by simple fuzzing. They plague many SAT solvers, I’m sure, as I keep on finding them at every corner. They are the silent bugs that prevent solvers to be more performant, yet they don’t diminish the resulting solution — only the time takes to find it.

Why CryptoMiniSat 3.3 doesn’t have XORs

I’ve had a number of inquiries about why CryptoMiniSat 3.3 doesn’t have native XOR clause support. Let me explain what I think the benefits and the drawbacks are of having native XOR support and why I made the choice of removing them.

Benefits

The most trivial benefit is that users don’t have to translate XORs into plain CNF. Although transforming is rather easy, some prefer not to go through the hassle. The principle is to cut the XOR into smaller XORs by introducing new variables and then representing the cut-down XORs by banning all the wrong combinations. For example the the XOR x+y+z+w=0 is cut into two XORs x+y+r=0, z+w+r=0 and then x+y+r=0 is represented as x \vee y \vee r, x \vee \neg y \vee \neg r, \ldots.

A more substantial benefit is that propagation and conflict generation can be done natively. This is quite advantageous as it means that far less clauses and variables need to be visited while propagating and generating the UIP conflict. Less variables and clauses visited means less cache-misses and due to the more compact representation, less memory overhead and thus even better cache usage. (For the aficionados: an interesting side-effect of XORs is that blocked literals cannot be used for XOR clauses).

The biggest benefit of having native XORs is that Gaussian elimination can be carried out at every level of the search tree, not just at top-level. Although most people I have talked with still think that CryptoMiniSat 2 only performs Gaussian elimination only at top-level, this is not the case. Performing Gauss at top-level is a two-day hack. Performing it at every level took me about half a year to implement (and about 5K lines of code). It requires leaving traces of past computations and computing reasons for propagations and conflicts indicated by the Gaussian elimination algorithm. According to my highly unofficial count, exactly 2 people use Gaussian elimination at deeper than top-level, Vegard Nossum and Kuldeep S Meel. Out of these two users, only the latter have indicated speedups. It speeds up crypto problems, but not many others, so it’s completely disabled by default.

Drawbacks

The main drawback of having native XOR clauses along normal CNF clauses is the loss of simplicity and universality. Let me explain in detail.

Loss of simplicity. Once a variable present in both a normal and an XOR clause, it cannot be eliminated in a simple way. I don’t even want to try to express the clauses that would result from doing varelim in such cases — it’s probably horribly convoluted. All algorithms in the code now need to check for XORs. On-the-fly clause strengthening cannot strengthen an XOR with a normal clause, of course. Clause cleaning has to take into account that when a 3-long XOR is shortened to a 2-long one, it indicates a new equivalent literal and that may lead to immediate UNSAT. I could go on, but the list is long.

Another problem is that XORs make the state more complex. Until now there were only clauses of the form a \vee \neg b \ldots, sometimes specially stored such as binary clauses. The state of the solver goes through many transformations — think of them as llvm passes — while it is in-processed and solved. The more complex the state, the larger and the more complex the code. While a simple clause-cleaning algorithm can be expressed in about 20 lines of code without implicit binary&tertiary clauses, the same algo blows up to about 100 lines of code with those optimizations. Once you add XOR clauses, it jumps to around double that. We just made our code 10x larger. Imagine adding native support for, e.g. at-most-n.

Loss of universality. If a variable is only XOR clauses, it can now be eliminated at the XOR level. A pure variable at the XOR level indicates an XOR clause that can be removed and later always satisfied. XOR subsumption is when an XOR subsumes the other: the larger can be removed and their XOR added in. All of these are implemented in CMS 2. Probably there are many others out there, though. We could try to implement them all, but this is a whole new universe that may open up the gates of hell. Worst of all, most of these techniques on XORs are probably already simulated by in-processing techniques at the CNF level. Pure XOR variables are simulated by blocked clause elimination. XOR-based variable elimination doesn’t make practical sense at the CNF level, but it could be simulated by normal variable elimination if ignoring heuristics. I could go on.

What CryptoMiniSat 3.3 does

The compromise I came up with for CryptoMiniSat 3.0 is that I regularly search for XOR clauses at the top-level, perform top-level Gaussian elimination, and add the resulting new information to the CNF. Binary and unitary XOR clauses are the simplest to add back.

In the SAT’11 application benchmark set, out of 300 problems, 256 problems contain XOR clauses. the number of XOR clauses found, when they are found, is on average 2905 with an average size of 4.2 literals. Using Gaussian elimination, counting only when something was extracted, the average number of literal equivalences (binary XORs) extracted is 130, while the number of unitary clauses extracted is 0.15.

Here is a typical example output:

c Reading file 'valves-gates-1-k617-unsat.shuffled-as.sat03-412.cnf.gz'
c -- header says num vars:         985042
c -- header says num clauses:     3113540
[.. after about 124K conflicts ..]
c XOR finding  Num XORs: 104091 avg size:  3.0 T: 2.11
c Cut XORs into 2185 block(s) sum vars: 180818 T: 0.11
c Extracted XOR info. Units: 0 Bins: 1275 0-depth-assigns: 0 T: 8.34

Here, about 104K XOR clauses were found (using a sophisticated algorithm) within 2.11s, all of size 3. These XOR clauses were subjected to disconnected component analysis (within the XOR sphere) and they were found to belong to 2185 components. It is important to cut XORs into groups, because Gaussian elimination is an O(n^{\approx 2.5}) algorithm and if we can reduce the n by having more than one matrix, the speed is significantly increased. Next, these 2185 matrices were Gauss-eliminated and were found to contain 1275 binary XOR clauses, which were quickly marked as equivalent literals.

To perform Gaussian elimination at top-level, CryptoMiniSat uses the excellent m4ri library, maintained by my old college and friend Martin Albrecht. It’s a lightweight yet extremely fast Gaussian elimination system using all sorts of nifty tricks to do its job. Unfortunately, for large matrices it can still take a long while as the algorithm itself is not cheap, so I have a cut-off for matrices that are too large.

What a future CryptoMiniSat 3.x could do

Once we have extracted the XORs, we could just as well keep them around during search and perform Gaussian elimination at levels below the top-level. This would bring us close to the old system in terms of real performance — native propagation (which would be unavailable) can only give a 1.5x-2x speedup, but Gaussian elimination at every level can give much more. I would need to clean up a lot of code and then, maybe, this would work. Maybe I’ll do this one day. Though, after spending weeks doing it, probably people will still believe it only does top-level. At least right now, it’s the case.

Conclusions

Implementing a new theory such as XOR deep into the guts of a SAT solver is neither easy nor does it provide a clear advantage in most situations. Those who push for these native theories have either not tried implementing them into a complicated solver such as lingeling/SatELite/CryptoMiniSat/clasp or have already bit the bullet such as the clasp group and I did, and it is probably limiting them in extending the system with new techniques. The resulting larger internal state leads to edge cases, exceptions, and much-much more code.

Handling disconnected components

In CryptoMiniSat 3.2 and above there is a disconnected component finding&solving system. This finds toplevel disconnected components regularly and solves them by launching an instance of the solver itself. If the component is unsatisfiable (UNSAT), the whole system is UNSAT and solving can be immediately aborted. If the component is satisfiable (SAT) the component’s solution is saved and later when printing the solution to the whole problem the component’s satisfying assignment is also printed.

Why deal with disconnected components?

Disconnected components in SAT instances are problematic because the search logic of the solver doesn’t effectively restrict its search to a single component at a time, making its strategy ineffective. Firstly, learnt clause cleaning and variable activity heuristics behave worse. Secondly, solutions to components are thrown away when the search restarts when another component’s solution is too difficult to find. Note that some solvers, e.g. Marijn Heule’s march, do restrict their search to one component at a time, though I personally don’t know of any other.

Disconnected components seem to be quite prevalent. In the SAT’11 competition‘s set of 300 Application instances, 53 contained disconnected components at top-level at the beginning of the search. On average the number of disconnected components for these 53 instances were 393(!). When doing top-level analysis on a regular basis during search, the number of problems containing disconnected components rises to 58 and the average number of components found decreases to 294.

Finding disconnected components

The algorithm I use to find disconnected components is relatively easy. It essentially performs the following: it initializes a set of datastructures that hold information about the found component(s) and then feeds each clause one-by-one to a function that creates new component(s) and/or merges existing components. The datastructures needed are:

  • Array ‘varToComp’ that indicates which variable is in which component. If a variable is in no component yet, it has special value MAX
  • A map of arrays ‘compToVar’ indexed by the component. For example, compToVar[10] points to the array of variables in component number 10
  • Variable ‘nextComp’, an ever-incrementing index indicating the maximum component number used
  • Variable ‘realComps’ indicating the actual number of components — some components counted by ‘nextComp’ may have been merged, and are now empty

The algorithm, in pseudo-python is:

nextComp = 0;
realComps = 0;
varToComp = [MAXVAR] * numVars();
compToVar = {};

for clause in clauses:
  handleClause(clause);

def handleClause(clause) :
  newComp = []
  compsToMerge = set();
  
  for lit in clause:
    var = lit.var();
    if varToComp[var] != MAX :
      comp = varToComp[var];
      compsToMerge.insert(comp);
    else:
      newComp.append(var);

 
    #no components to merge, just put all of them into one component
    if len(compsToMerge) == 1 :
      comp = compsToMerge.pop();
      for var in newComp:
        varToComp[var] = comp
        compsToVar[comp].append(var);

      return;
  
    #delete components to merge and put their variables into newComp
    for comp in compsToMerge:
      vars = compToVar[comp];
      for var in vars:
        newComp.append(var);

      compToVar.erase(comp);
      realComps--;
    
    #mark all variables in newComp belonging to 'nextComp'
    for var in newComp:
      varToComp[var] = nextComp;
    
    compToVar[nextComp] = newComp;
    nextComp++;
    realComps++;

There are ways to make this algorithm faster. A trivial one is to remove the ‘set()’ and use an array initialized to 0 and mark components already seen. A fixed-sized array of the size of variables will do the job. Similarly, the use of ‘nextComp’ forces us to use a map for ‘compToVar’ which is slow — a smarter algorithm will use a round-robin system with an array of arrays. We can also avoid having too many small components at the beginning by calling the handleClause() function with the longer clauses first. It doesn’t give any guarantees, but in practice works quite well.

Finally, we need to add approximate time measurements so that we could exit early in case the algorithm takes too much time. According to my measurements, on the SAT’11 Application instances it took only 0.186s on average to find all components, and in only 34 cases out of 1186 did it take longer than the timeout — which was set to around 7s. At the same time, the maximum time ever spent on finding components was 7.8s. In other words, it is almost free to execute the algorithm.

Handling disconnected components

CryptoMiniSat handles found components in the following way. First, it counts the number of components found. If there is only one component it simply continues solving it. If there are more than one, it orders them according to the number of variables inside and picks the smallest one, solves it with a sub-solver, and checks it solution. If it’s UNSAT, it exits with UNSAT, otherwise it saves the solution for later and picks the next smallest component until there is only one left. When there is only one left, it continues its job with this, largest, component.

The sub-solver CryptoMiniSat launches is a bit special. First of all, it renumbers the variables so if variables 100..120 are in a component, the sub-solver will be launched with variables 0..20. This saves memory in the sub-solver but it means variables must be back-numbered when extracting solutions. Since the subsolvers themselves also internally re-number variables for further speedup, this can get a little bit complicated.

Further, if the sub-solver is handling an extremely small problem (<50 vars) most of its internal functionality is turned off to reduce build-up and tear-down speed. This is important, because in case there are 5000 components we need to go through, even a 0.01s build-up&tear-down time is unacceptable. Finding 5000 components, by the way, is not some wild imaginary number: it happens with the instance transport-[...]10packages-2008seed.040 from SAT'11. This instance actually contains ~5500 components at beginning of the search at toplevel, with an average of 8 variables each.

Demo

Let’s see an example output from CryptoMiniSat:

c Reading file 'traffic_3b_unknown.cnf.gz'
[..]
c -- clauses added: [..] 533919 irredundant 
c -- vars added      39151
[..]
c Found components: 23 BP: 8.85M time: 0.06 s
c large component     0 size:        833
c large component     1 size:       1071
c large component     2 size:       4879
c large component     4 size:        357
c large component     5 size:      14994
c large component     6 size:        476
c large component     7 size:        952
c large component     9 size:        952
c large component    10 size:        595
c large component    11 size:      10234
c large component    16 size:        476
c large component    17 size:        476
c large component    19 size:        476
c Not printed total small (<300 vars) components:10 vars: 2380

In total 23 components were found, the largest component containing 15K variables. The other 22 components contain 10K to as little as less-than 300 variables. Notice that finding the components was essentially free at 0.06s (on a i7-3612QM).

Another typical output is:

c Reading file 'transport-transport-city-sequential-25nodes-1000size-3degree-100mindistance-3trucks-10packages-2008seed.060-SAT.cnf.gz'
[...]
c -- clauses added [..] 3869060 irredundant
c -- vars added     723130
[...]
c Found component(s): 2779 BP: 253.02M time: 0.58 s
c large component  2778 size:     657548
c Not printed total small (<300 vars) components:2778 vars: 25002
c Coming back to original instance, solved 2778 component(s), 25002 vars T: 1.46

In this case there were 2.8K small disconnected components with (on average) 9 variables each within a problem that originally contained 723K variables and 3.9M clauses. The components were found in 0.58s and all but the largest were solved within 1.46s. This means that 2.8K CryptoMiniSat sub-solvers were launched, initialized, ran and destroyed within a span of 1.46s. The largest component with 658K variables is what's left to the solver to deal with. The rest have been removed and their solutions saved.

Conculsions

It's fun, useful, and relatively easy to find disconnected components in SAT problems. They appear at the beginning of the search for some problems, and for some more, during search. Although it is not a game-changer, it is a small drop of water in a cup that gets mostly filled with small drops anyway.

People doing problem modeling for a living are probably crying in horror and blame the modeler who created an instance with thousands of disconnected components. As someone who deals with CNFs though, I don't have the liberty of blaming anyone for their instances and have to deal with what comes my way.

CryptoMiniSat 3.3 released

This is just a short note that CryptoMiniSat 3.3 has been released. This is mainly a bugfix release. I have managed to make a fuzzer that simulates library usage of the system by interspersing the CNF with “c Solver::solve()” calls and then checking the intermediate solutions. Checking is either performed by verifying that all clauses are satisfied, or if the solution is UNSAT, by using a standard SAT solver (I use lingeling). This turned up a surprising number of bugs. I have also fixed the web-based statistics printing which should now work without a hitch.

The non-bugfix part of the release is that CryptoMiniSat now saves memory by dynamically shortening and reallocating the watchlist pointer array and some other data structures. This can save quite a bit of memory on large, sparse instances.