SMT Competition’14 and STP

The 2014 SMT competition‘s QF_BV divison is over (benchmark tarball here), and the results are (copied from here):

Solver Errors Solved Not Solved CPU time (solved instances)
Boolector 0 2361 127 138077.59
STP-CryptoMiniSat4 0 2283 205 190660.82
[MathSAT] 0 2199 289 262349.39
[Z3] 0 2180 308 214087.66
CVC4 0 2166 322 87954.62
4Simp 0 2121 367 187966.86
SONOLAR 0 2026 462 174134.49
Yices2 0 1770 718 159991.55
abziz_min_features 9 2155 324 134385.22
abziz_all_features 9 2093 386 122540.04

First, let me congratulate Boolector by Armin Biere. STP came a not-so-close second. STP was meant to be submitted twice, once with MiniSat and once with CyrptoMiniSat4 — this is the only reason why the STP submission is called STP-CryptoMiniSat4. Unfortunately, the other submission could not be made because of some compilation problems.

About the results

There are a couple of things I would like to draw your attention to. First is the cumulated solving time of solved instances. Note how it decreases compared to MathSat and Z3. Note also the differences in the DIFF of the number of solved instances. It’s relatively small between 4Simp…MathSat(<40) and increases dramatically with STP and Boolector (with around 80 each). Another thing of interest is that STP is surrounded by commercial products: Boolector, Z3 and MathSAT are all products you have to pay for to use in a commercial setting. In contrast, the biggest issue with STP is that if the optional(!) CryptoMiniSat4 is linked in, it's LGPLv2 instead of MIT licensed. In other words, a lot of effort is in there, and it's all free. Sometimes free as in beer, sometimes free as in freedom.

About STP

Lately, a lot of work has been done on STP. If you look a the github repository, it has about 80 issues resolved, about 40 open, lots of pull requests, and a lot of commits. A group of people, namely (in nickname order): Dan Liew, Vijay Ganesh, Khoo Yit Phang, Ryan Govostes, Trevor Hansen, a lot of external contributors (e.g. Stephen McCamant) and myself have been working on it pretty intensively. It now has a an automated test facility and a robust build system besides all the code cleanup and other improvements.

It’s a great team and I’m happy to be a minor part of it. If you feel like you could contribute, don’t hesitate to fork the repo, make some changes, and ask for a pull request. The discussions on the pull requests can be pretty elaborate which makes for a nice learning experience for all involved. Come and join — next year hopefully we’ll win :)

Multi-threading and startup speed

When doing multi-threading, parsing the input and inserting it into the solver threads is actually one of the most tricky part. Of course one can simply do

void add_clause_to_all_threads(Clause clause) {
  for(int i = 0; i < threads; i++) {
    solver[i].add_clause(clause);
  }
}

It works and my SAT solver entry to the competition used this method. However, if we have 12 threads, the startup time will be 12x more than usual since only one thread is adding the clauses to all the solvers. If startup used to be 3s, it's suddenly more than half a minute.

A simple solution

A simple solution to this problem is to create threads and add the clauses to the threads individually. The problem with this is that creating 12 threads for each clause and destroying the threads for each is actually more time than to do what's above.

The correct solution

The real solution is then to store the data to be added to the threads, and once a good number of them have been stored, add all the cached clauses to the solvers: create the 12 threads, for each thread add the clauses, and destroy the threads. This solution amortizes the creation&destruction of threads at the expense of storing clauses in a cache.

CryptoMiniSat now uses a clause cache for exactly this purpose. Naturally, this also necessitates a cache for variable creation, though that is only a counter, so not a big deal.

Final polish of my SAT competition entry

[wpdm_file id=2] UPDATE: Fixed crash, UPDATE2: fixed looking for Gaussian elimination library

In the last days of the competition deadline, I have improved the following on the SAT solver:

  • Adding N new variables is now possible in one go. This reduces variable addition overhead, especially when having many threads.
  • Memory allocation overhead per new clause addition is now much smaller thanks to judicious use of globally allocated temporaries. Using such temporaries is very dangerous as CryptoMiniSat uses addClause() from multiple places in the code in a recursive manner. However, I only use the temporaries for add_clause_outer(), so things should be fine.
  • If more than 0.5M variables or 1.5M binary clauses are in the problem, on-the-fly hyper-binary resolution and transitive reduction is turned off during 1st-decision-level search. This is kind of like probing, but during search. However, unlike probing, it cannot time-out and switch off these systems in case they takes too much time.
  • Diversified the threads’ parameters. Also, I reduced the number of threads to 8. I wanted to run with 12, but it’s a bit dangerous from a memory-usage perspective: there is only 24GB available for 12 cores, which means 2GB/core. Some problems take more than 2GB just to parse into the watchlists.
  • Check total memory usage at startup of threads and if too much, halve the total number of threads. This is an emergency measure in case things go wild due to very weird CNF.

Overall these are small changes but allow for a much faster startup. For example a notorious problem, AProVE07-11.cnf, now starts in 2.5s instead of 4.5s. As far as I can tell, this is very similar to the startup time of lingeling on this instance. However, lingeling only uses 1/3rd the memory (~250MB) thanks to a more tight memory manager plus I suspect it doesn’t have a couple of datastructs that I keep around.

The other difference, about OTF hyper-binary resolution, allows for large problems to actually get to the point of solving instead of getting stuck at adding and removing useless binary clauses. Let’s hope all goes well for the competition :)

My SAT Competition 2014 entry

[wpdm_file id=2]

In case you want to download my SAT competition 2014 zipfile, you can do it now. You can port the changes that you added to CryptoMiniSatv4 — it should work out of the box. Nothing really important changed except some timeout fixes, bug fixes, threading, and fixing DRAT.

Unfortunately the SAT Competition’14 setup doesn’t have Boost’s program options, so I had to create a main_simple.cpp that compiles without Boost. It only supports options “–drat=1”, “–threads=N”, “–verbosity=N” and reading plain CNF files (not gzipped). This means you will have to edit SolverConf.cpp in case your changeset involves command line options. The command line options and their corresponding class variables are easy to look up in main.cpp, so changing SolverConf.cpp appropriately should not be difficult.

Good luck with the competition!

Multi-threading

Just a short note: this Easter holiday I got a bit bored and did a mini-threading system for CryptoMiniSatv4. It will now be able to compete in the multi-threading track after quite a number of years of absence. Last time I did this, CryptoMiniSat won the UNSAT multi-threaded track, which was a bit funny because all I shared were unitary and binary clauses. For the simplicity and the fun, I’m doing the same this time around. Let’s hope all the super-complicated and extra-theoretical stuff will get won over.

Non-blocking datastructs

Some authors implement a non-blocking data-sharing construct to gain more performance out of the system. I think/hope that what is shared is more important than how. I use one global lock for the shared unit and one for the shared binary clauses. Since there won’t be more than 8 threads under normal circumstances and I have to lock about once per 10sec per thread for about 0.01s, there won’t be any contention. Under these circumstances, simplicity wins over the complexity of using non-locking datastructures. Also, these fancy datastructs probably don’t have an implementation that is C++11 (i.e. portable) and is already implemented in gcc 4.7.1 (which doesn’t support a number of the threading constructs), the compiler used in the competition.

Library usage

This time around I didn’t mess it up: multi-threading can be used from the library interface. The system completely hides the fact that multiple threads are competing in the background to solve the problem. The threads are cleanly and safely stopped and then re-started when the user asks for another solve() operation, optionally with assumptions. In other words, the user doesn’t see and doesn’t have to bother with the complexity of threads. All of that is abstracted away behind a neat and clean API. This means that e.g. the different threads can have radically different variable numbers. They could have renumbered the variables differently, added new and different BVA variables, etc. All of this is hidden away and the user is presented with and always communicates using his/her own variable numbers.

Final notes

This simple multi-threading system means that I can now port my old MPI code to use the new CryptoMiniSat. Maybe, once again, CryptoMiniSat will run on a cluster. Fun times ahead!