Category Archives: Tools

Tools developed by me

Towards CryptoMiniSat 5.0

I have worked a lot on CryptoMiniSat 5.0 in the past months so I thought I’d write a little bit about what I spent my time on.

Amazon AWS

I have put a lot of effort into using Amazon’s AWS service to run CryptoMiniSat (CMS). This is necessary in order to compete at the SAT competition, where my competitors have access to massive resources, some to clusters with over 20k CPU cores. Competing against that with a 4-core machine, like I did last year, will simply not cut it.

The system I built has a client-server architecture where the server is a very small machine (t1.micro) that hands out jobs to one or more very beefy client machines (c4.8xlarge with 18 real cores). I need this architecture because the client is a so-called spot instance, so Amazon can shut it down at any time. The server keeps track of what has been solved and what needs to be solved next to complete the job. When the job finishes, both the server and the client shut down. I simply need to issue, e.g., “./launch_server.py --git 82c4e5adce --s3folder newrun --cnfdir satcomp091113 -t 5000” and it will launch the full set of SAT competition 09+11+13 instances with a 5000s timeout using a specific Git revision of CryptoMiniSat. When it finishes (in about 4-5 hours), it (should) send me a mail with the command line to use to download all the data from Amazon S3. It’s neat, fast, and literally just one command line to use.
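The bookkeeping the server has to do can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual launch_server.py code: the class and method names are hypothetical, and networking, S3 uploads and instance shutdown are left out entirely.

```python
import collections


class JobServer:
    """Hands out CNF files to clients and remembers what is finished.

    Hypothetical sketch: names and structure are assumptions, not the real system.
    """

    def __init__(self, cnf_files):
        self.todo = collections.deque(cnf_files)  # not yet handed out
        self.running = {}                         # file -> client id
        self.done = {}                            # file -> result

    def next_job(self, client_id):
        """Return the next file to solve, or None if nothing is waiting."""
        if not self.todo:
            return None
        cnf = self.todo.popleft()
        self.running[cnf] = client_id
        return cnf

    def report(self, cnf, result):
        """Record the result of a finished job."""
        self.running.pop(cnf, None)
        self.done[cnf] = result

    def client_lost(self, client_id):
        """The spot instance went away: put its unfinished jobs back in the queue."""
        lost = [f for f, c in self.running.items() if c == client_id]
        for cnf in lost:
            del self.running[cnf]
            self.todo.appendleft(cnf)

    def finished(self):
        """True once nothing is waiting and nothing is in flight."""
        return not self.todo and not self.running
```

The point of keeping the state on the small, always-on server is that a terminated client loses at most the jobs it was running at that moment, and those are simply handed out again.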

As for how much I have used it, I have spent over $100 on running costs on AWS in the past 2 months. A run like the one above costs about $2. Not super-cheap, but not the end of the world, either.

Testing and continuous integration

I have TravisCI, Coverity, and Coveralls integration. These provide continuous integration testing, static analysis, and code coverage analysis, respectively. I find TravisCI to be immensely valuable, and I would have trouble not having it for a new project. Coverity is also pretty useful: it has actually found some pretty stupid mistakes I have made. Finally, Coveralls has a terrible interface, but I like the idea of having test code coverage analysis and it encourages me to put more effort into that. For example, it highlights pretty well the areas that I typically break when coding without realizing it. TravisCI usually warns me if there is something bad, except when there is no (or too little) coverage. I am also looking into Docker, which would allow for continuous delivery.

Checking against SWDiA5BY

I have integrated the main idea of the SWDiA5BY A26 code into CryptoMiniSat. Further, I am in the process of integrating one of the patches available on the author’s website. I find these patches to be really interesting, and using SWDiA5BY A26 as a check against my own system has allowed me to get rid of a lot of bugs. So, I am greatly indebted to the authors of MiniSat, Glucose and SWDiA5BY.

Conclusions

In the past months I have put a lot of effort into cleaning up, fixing, and taking control of CryptoMiniSat in general. There have been over 240 issues filed on GitHub against CryptoMiniSat over the years, and only 7 are currently open. This is a testament to how open and dynamic the solver development is. If you are interested in helping to develop the solver, have new ideas, or have any commercial interest in it, don’t hesitate to contact me.

CryptoMiniSat 4 released

CryptoMiniSat 4 is now available for download. This version brings a number of substantial improvements and picks up speed to be as good as the best solvers out there. It now has a much improved library interface as well as a simple but powerful Python interface.

SAT Competition 2014

This release is made ahead of the SAT Competition 2014 deadlines so anybody can compete and actually have a chance to win. Unfortunately, the way I see it, it’s not possible to use newer versions of lingeling or riss (see their licenses for details), MiniSat is rather old, and glucose doesn’t have new simplification techniques. If you feel the same way, and you would rather not write 30K lines of code, you might enjoy playing with CryptoMiniSat v4 and submitting it to the competition. You can change as much as you like, it’s LGPLv2; just don’t call it CryptoMiniSat.

Improvements and techniques

Here is a non-exhaustive list of techniques used in CryptoMiniSat v4:

  • Variable elimination and replacement, strengthening, subsumption, vivification
  • On-the-fly stamping, literal caching, hyper-binary resolution and transitive reduction during failed literal probing
  • Bounded variable addition with hack to allow 2-literal diff
  • DRUP-based unsatisfiable proof logging
  • Gate-based clause shortening and removal
  • XOR recovery and manipulation (NOTE: uses the M4RI library that is GPL, if you want LGPL, compile without it)
  • Precise time- and memory tracking. No time or memory-outs on weird CNFs
  • Precise usefulness tracking of all clauses
  • Clause usefulness-based redundant clause removal. Glues are not used by default, but glues are tracked and can be used (command line option)
  • Variable renumbering and variable number hiding. Thanks to this, XOR clauses are cut and the added variables are transparent to the user.
  • SQL-based data logging and AJAX-based powerful data display
  • And of course many-many more

All of the above are implemented as inprocessing techniques. I do not believe in preprocessing and the solver does not in fact use preprocessing at all — it immediately starts to solve instead. This, as everything else, is configurable and you can change it by passing '--presimp 1' as a command-line option. There are a total of 120 command-line options so you can tune the solver as you like.

Python interface

It’s intuitive and fun to use:

You can even have assumptions:

All the power of the SAT solver in a very accessible manner. XOR clauses are trivial, too:

Where the second argument is the right hand side (RHS) of the equation v1 XOR v2 = False.
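To make the meaning of the right-hand side concrete, here is a small pure-Python sketch that expands an XOR constraint into the equivalent plain CNF clauses. It does not use CryptoMiniSat at all; the function name is made up for illustration, and literals follow the usual DIMACS convention (3 means variable 3 is true, -3 means it is false).

```python
import itertools


def xor_to_cnf(variables, rhs):
    """Expand v1 XOR v2 XOR ... = rhs into equivalent CNF clauses.

    Illustrative helper, not part of any solver's API.
    """
    clauses = []
    for values in itertools.product([False, True], repeat=len(variables)):
        parity = sum(values) % 2 == 1
        if parity != rhs:
            # This assignment violates the XOR, so add the one clause
            # that is falsified by exactly this assignment.
            clauses.append([-v if val else v
                            for v, val in zip(variables, values)])
    return clauses


# v1 XOR v2 = False means the two variables must be equal
print(xor_to_cnf([1, 2], False))
```

An XOR over n variables expands to 2^(n-1) clauses this way, which is why it pays to hand XORs to the solver natively instead of encoding them by hand.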

C++ interface

Usage is pretty simple, and the header files have been significantly cleaned up:

Some suggestions where you can improve the solver to compete

Here is a non-exhaustive list of things that you can improve to win at the competition:

  • Add your own weird idea. You can add new variables if you like, use the occurrence lists already built, and take advantage of all the datastructures (such as stamps, literal cache) already present.
  • Tune the parameters. I only have exactly one i7-4770 to tune the parameters. You might have more. All parameters are accessible from command line, so tuning should be trivial.
  • Use glues to clean clauses. Or use a combination of glues and usefulness metrics. All the metrics are at your fingertips.
  • Make bounded variable addition work for learnt clauses. I could never figure this one out.
  • Improve the ordering of variable elimination. Makes a huge difference.
  • Try a different approach: I use the ‘heavy’ approach where I don’t remove all clauses that I can as I like strong propagation properties. You might try the ‘light’ approach where everything is removed if possible. Just set variable elimination to 100% and add blocked clause elimination. It might work.

For example, below is the code that calculates which clause should be cleaned or kept. You can clearly see how easily this can be changed using the data elements below:
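As a language-neutral illustration of the idea, here is a minimal Python sketch of usefulness-based cleaning. It is not CryptoMiniSat’s actual code: the field names, the usefulness formula and the keep ratio are all hypothetical, chosen only to show how easily the ranking criterion can be swapped.

```python
from dataclasses import dataclass


@dataclass
class ClauseStats:
    """Per-clause counters. All field names here are hypothetical."""
    clause_id: int
    glue: int
    propagations: int
    conflicts: int


def usefulness(stats):
    """Assumed metric: how often the clause took part in propagation or conflict."""
    return stats.propagations + stats.conflicts


def split_by_usefulness(clauses, keep_ratio=0.5, key=usefulness):
    """Rank redundant clauses by `key` and return (kept, removed)."""
    ranked = sorted(clauses, key=key, reverse=True)
    n_keep = int(len(ranked) * keep_ratio)
    return ranked[:n_keep], ranked[n_keep:]
```

Switching to glue-based cleaning is then a one-line change, for example `split_by_usefulness(clauses, key=lambda c: -c.glue)`, and a combined metric is just another `key` function.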

If you were thinking about submitting your weird hack to the MiniSat hack track, think about doing the same to CryptoMiniSat v4. You might actually win the real competition. You can change as much as you like.

I will submit a description of CryptoMiniSat v4; your description can simply say that it’s the same except for xyz that you changed. The point of the descriptions is so that people can read what you did and why, and then understand the results in that light. Just explain carefully what you did and why, and you should be fine.

Thanks

Many-many thanks to Martin Maurer who has submitted over 100 bug reports through the GitHub issue system. Kudos to all who have helped me use, debug and improve the solver. To name just a few: Vegard Nossum, Martin Albrecht, Karsten Nohl, Luca Melette, Vijay Ganesh and Robert Aston.

My SAT solver fuzzing setup

Since the SAT solver fuzzing paper by Brummayer, Lonsing and Biere, SAT solver fuzzing has been quite the thing in the SAT community. Many bugs have since been discovered in almost all SAT solvers, including those of this author. In this blog post I will detail my SAT fuzzing setup: what fuzzers I use and how I use them.

My fuzzers

The most obvious fuzzers to use are those presented in the paper and on the accompanying website: CNFFuzz and FuzzSAT. These fuzzers randomly generate high-quality structured problem instances. The emphasis here is on the word structured. Randomly generated instances, as the authors of the original paper note, do not exercise SAT solvers enough. SAT solvers are written with very specific problem types in mind: problems containing long chains of binary clauses, gates, etc.

Over the year(s) I have found, however, that these fuzzers are not enough. One of the revelations came when Vegard sent me lots of bug reports about CryptoMiniSat failing when Gaussian elimination was enabled. It turned out that although I ran the fuzzer for long hours, I only ran it with the default options, with Gaussian elimination turned off. So default options are sometimes not enough, and once off the beaten track, systems can fail.

Vegard then sent me a fuzzer that wasn’t using CNFFuzz or FuzzSAT, but instead his own CNF generator for SHA-1 hash functions. Instances thus generated are not only structured, they are also very picky about the solution: they have either exactly one solution or none. This helped fuzz solution reconstruction, which after blocked clause elimination may not be exactly easy. In fact, it’s so good at fuzzing that it finds a bug about binary clauses that I have never been able to fix in CryptoMiniSat, to this day. Whenever I block them, I cannot reconstruct the solution, ever, unless I disable equivalent literal replacement. The authors of the paper give a pretty complicated proof about why it should work in the presence of equivalent literal replacement, which I still haven’t understood.

When I implemented disconnected component analysis in CryptoMiniSat 3, I knew I had to be careful. In the SAT Competition’11 I failed terribly with it: CryptoMiniSat 2 memory-outed and crashed on many instances with disconnected components. The solution I came up with was to make a program that could concatenate two or more problems into one problem, renumbering the variables. This meta-fuzzer (which used fuzzers as inputs) was then used to fuzz the disconnected component handling system. Although this sounds sufficient, it is in fact far from it. Firstly, the problems generated are disconnected at the top level, whereas problems in the real world sometimes disconnect later during the search. Secondly, it took 2, maybe 4 CNF inputs, but real problems sometimes have thousands of disconnected components. Both of these issues had to be addressed.
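The concatenation step itself is simple. A minimal sketch, assuming each input problem is already parsed into a variable count and a list of clauses (the function names are mine, not those of the actual meta-fuzzer):

```python
def concat_cnfs(problems):
    """Merge several CNFs into one whose components share no variables.

    `problems` is a list of (num_vars, clauses) pairs; each clause is a
    list of non-zero DIMACS literals. Returns (total_vars, clauses).
    """
    offset = 0
    merged = []
    for num_vars, clauses in problems:
        for clause in clauses:
            # Shift every variable index by the number of variables used so far,
            # keeping the sign of the literal.
            merged.append([lit + offset if lit > 0 else lit - offset
                           for lit in clause])
        offset += num_vars
    return offset, merged


def to_dimacs(num_vars, clauses):
    """Render the merged problem as DIMACS text."""
    lines = ["p cnf %d %d" % (num_vars, len(clauses))]
    lines += [" ".join(str(lit) for lit in clause) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"


total, clauses = concat_cnfs([(2, [[1, -2]]), (3, [[-1, 3], [2]])])
print(to_dimacs(total, clauses))
```

Because the loop takes any number of inputs, the same function can also be fed hundreds of small problems to produce instances with many components.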

One of the most difficult parts of modern SAT solvers is time-outs of complicated algorithms. Time-outs sometimes make code quite convoluted, and thus are a source of bugs. However, most fuzzers generate problems that are too small for any time-outs to occur, thus severely limiting the scope of the fuzzer. So, the fuzzers don’t exercise a part of the system that is fully exercised by real-world problems. To fix this, I have written a short fuzzer script that generates enormous, but simple problems with millions of variables. This helped me fix a lot of issues in timeouts. I advise having a RAM-disk when using it though, as it writes gigabyte-size CNFs.
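What “enormous, but simple” can look like is sketched below. This is an assumed construction, not my actual script: it writes one unit clause followed by a long chain of implications, so the instance is trivially satisfiable but forces the solver to handle millions of variables and clauses.

```python
import io


def write_big_simple_cnf(out, num_vars):
    """Stream a chain x1, x1 -> x2, x2 -> x3, ... to the file-like object `out`.

    Writing line by line keeps memory use flat even for millions of variables.
    """
    out.write("p cnf %d %d\n" % (num_vars, num_vars))
    out.write("1 0\n")                           # unit clause: x1 is true
    for v in range(1, num_vars):
        out.write("%d %d 0\n" % (-v, v + 1))     # x_v implies x_(v+1)


# Tiny demonstration; a real run would use millions of variables
buf = io.StringIO()
write_big_simple_cnf(buf, 5)
print(buf.getvalue())
```

For a real run, open a file on the RAM-disk and pass a `num_vars` in the millions; on most Linux systems a path under /dev/shm serves that purpose.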

I also use some other fuzzers, namely one that exercises XOR manipulation and the sgen4 CNF generator by Ivor Spence (PDF, pages 85-86) that generates problems of tuneable hardness.

My fuzzing system

Fuzz CNF generators are nice, but not enough on their own. A system around them is necessary to get the full benefit. Writing one sounds trivial at first, but my system around the fuzzers is about 1000 lines of Python code, so maybe it’s not that trivial after all. It has to do a number of things to be truly useful.

First, it needs to test the answers given by the solver. If the answer is SAT, it needs to read the (gzipped) CNF file and check the (partial) solution. If the answer is UNSAT, it needs to call the DRUP checker to verify the proof and/or call another solver to see whether it finds a solution, and then verify that the other solver’s solution is correct. I actually run both DRUP and the other solver: you can never be sure enough. Note that solution verification also complicates time-limiting: we don’t want to wait 30 minutes because a fuzz CNF is too complicated, and we don’t want to indicate an error just because the problem is hard.
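The SAT half of that check is the easy part. A minimal sketch, assuming a DIMACS file with one clause per line and a solution given as a list of true literals (the helper names are illustrative, not taken from my fuzzing system):

```python
import gzip
import os
import tempfile


def read_cnf(path):
    """Parse a plain or gzipped DIMACS file into a list of clauses."""
    opener = gzip.open if path.endswith(".gz") else open
    clauses = []
    with opener(path, "rt") as f:
        for line in f:
            line = line.strip()
            if not line or line[0] in "cp":
                continue                  # skip comments and the header
            lits = [int(tok) for tok in line.split()]
            if lits and lits[-1] == 0:
                lits.pop()                # drop the terminating 0
            clauses.append(lits)
    return clauses


def check_solution(clauses, true_lits):
    """Return the first clause the (partial) solution leaves unsatisfied, or None."""
    true_lits = set(true_lits)
    for clause in clauses:
        if not any(lit in true_lits for lit in clause):
            return clause
    return None


# Small demonstration on a temporary gzipped file
demo_path = os.path.join(tempfile.mkdtemp(), "demo.cnf.gz")
with gzip.open(demo_path, "wt") as f:
    f.write("c demo\np cnf 3 2\n1 -2 0\n2 3 0\n")
print(check_solution(read_cnf(demo_path), [1, 3]))
```

A partial solution passes only if every clause contains at least one literal that is explicitly set to true; unassigned variables satisfy nothing, which is the strict reading you want in a checker.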

Secondly, fuzzing is not only about generating CNFs that trigger bugs, but also about saving these problematic cases for later, as a form of regression tests. These regression tests need to be run regularly, so that bugs that came up before don’t come up again. I have a large set of regression tests that I have accumulated over the years. These help me increase confidence that when I improve the solver, I don’t accidentally re-introduce bugs. My fuzzer system helps manage and run these regression tests.

Finally, we don’t just want to fuzz the solver through the CNF interface. SAT solvers typically expose a library interface where one can add clauses, solve (with assumptions) and check the conflict returned in terms of the assumptions given. For now, CryptoMiniSat’s library is fuzzed through a specialized CNF format, where “c Solver::solve()” lines are interspersed with the regular clauses. These lines tell the solver to call the solve() method and output the solution to a file. This file is then read by the fuzzing system and verified by copying parts of the file into another CNF and using the DRUP checker and another SAT solver to verify the results. This simulates library usage without assumptions. Thanks to Lukas Prokop, who alerted me to bugs in assumptions usage, assumptions-based solving is also simulated through “c Solver::solve( LIT LIT LIT…)” calls, where the LITs are replaced with random literals.
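The “copying parts of the file into another CNF” step can be sketched as follows. This is an assumed reconstruction of the idea rather than my actual code: every solve marker produces one sub-problem made of the clauses seen so far, with the assumptions (taken to be space-separated literals inside the parentheses) turned into unit clauses so that any ordinary solver can re-check the call.

```python
def split_at_solve_calls(lines):
    """Yield one (clauses, assumptions) pair per 'c Solver::solve(...)' line."""
    marker = "c Solver::solve("
    clauses = []
    for line in lines:
        line = line.strip()
        if line.startswith(marker):
            inside = line[len(marker):].split(")")[0]
            assumptions = [int(tok) for tok in inside.split()]
            # Snapshot of everything added before this solve() call
            yield list(clauses), assumptions
        elif line and line[0] not in "cp":
            lits = [int(tok) for tok in line.split()]
            if lits and lits[-1] == 0:
                lits.pop()                # drop the terminating 0
            clauses.append(lits)


def as_plain_cnf(clauses, assumptions):
    """Encode assumptions as unit clauses for checking with another solver."""
    return clauses + [[lit] for lit in assumptions]


demo = ["p cnf 3 2", "1 2 0", "c Solver::solve()",
        "-1 0", "c Solver::solve( -2 3 )"]
for sub_clauses, sub_assumptions in split_at_solve_calls(demo):
    print(as_plain_cnf(sub_clauses, sub_assumptions))
```

Each sub-problem is cumulative, mirroring an incremental library session in which clauses are only ever added between solve() calls.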

Future work

I want to do two things related to testing and fuzzing. First, I want to improve the checker so that it checks for more problems. I want to check the conflict returned when assumptions are used. Further, I want to check timeouts — this will require some coordination between the fuzzer system and the solver where times are printed that are later checked for correctness. Some optimisations must never take more than N seconds on a fast enough machine, and this could be checked relatively easily.

Second, I want to move CryptoMiniSat to a test-driven system. This will require some quite extensive code rewriting so that test harnesses will be easier to write. However, it will help fix one issue that hasn’t been talked about, kind of like the elephant in the room: performance bugs. They are quite prevalent, and cannot be caught by simple fuzzing. They plague many SAT solvers, I’m sure, as I keep on finding them at every corner. They are the silent bugs that prevent solvers from being more performant, yet they don’t affect the correctness of the resulting solution, only the time it takes to find it.