Optimisations | Wonderings of a SAT geek

There are plenty of optimisations that I have tried with CryptoMiniSat. I will try to list a few that I finally didn’t consider using, but could be implemented later to get around 4-8% speedup each.

It seems a good idea to use a separate CNF simplifier such as SatELite before executing the SAT solver. The reasoning is, that even though one can import (or copy-paste) the code from SatELite, the memory will stay fragmented: variable elimination might remove half the variables, and so the arrays that store variable values will have unused bytes in them, using a larger footprint than necessary, thus causing cache misses. Though this is true, SatELite is not exactly perfect. For instance, it can be seen in certain instances of the SAT Competition of 2009 that GLUCOSE sometimes had very bad performance in comparison with other solvers exactly because SatELite took, e.g. 130 seconds, but the solving only took 2 seconds. So SatELite slowed down the solving of this instance. Also, SatELite has some limitations, e.g. it cannot handle more than 4’800’000 clauses otherwise it might take just too much time. These are huge drawbacks, and in fact CryptoMiniSat (and, e.g. PrecoSat) overcomes all of these. The memory fragmentation issue remains, though. It could be treated, but that would require about 2-3 weeks of time to get corrected, and I simply don’t have that much time. P.S.: Easy solution: rename the variables everywhere. Hard (but less error-prone) solution: use different datastructure for variable values.

Another possible optimisation that never made it to CryptoMiniSat (though I have a branch in git that achieves it more-or-less) is the close packing of clauses in memory. It can greatly increase the cache-hit ratio, though only for relatively small (<20-40 MByte) problems. The reason why I never got this done is that it is a source of a lot of potential problems. Firstly, there is no point in packing clause sizes that are rare. The most common clause size is 2 (see SAT Competition instances), so it’s only logical to pack these clauses. However, this presents a problem: how should the clause be freed after usage? If it was a packed clause, it needs to be freed specially (from the packed heap). But sometimes, non-packed clauses shrink to size 2, so they should become packed. Must these clauses be freed, and then re-allocated from the packed heap? Doesn’t that add too much complexity? Also, according to my experience, packed heaps only help when the problem is small. For large problems, the cache miss rate will be very high anyway, and so packing won’t make much of a difference.

There are some other optimisations that I missed out on, I think. I will one day look through my very messy git tree (with 41 branches and approx. 1600 commits) to see which ones did I once implement and never got working :)

PS: After the start of the SAT Race 2010, I will put all of the git tree up into gitorious. Then, everyone can use it and try to merge these trees into the main tree, and see what happens.