Parallelisation

I have lately been reading the book Using OpenMP and thinking about how CryptoMiniSat could be parallelised. Apparently, there are multiple ways to achieve it. I haven’t yet had time to read through all previous approaches, but it seems there are at least two main ways to distribute the solving: through guiding paths and using the idea of ManySat.

The first approach, guiding paths, decomposes the problem into multiple sub-problems, setting some variables to true or false. The problems are then solved completely independently with a master coordinating the effort. I don’t like this approach for two reasons. Firstly, for cryptographic problems, the decomposition can be made very well since the problem in its most abstract form (i.e. key, plaintext, ciphertext) is known, and so a very good guiding path can be calculated. However, even with this very good guiding path, the expected total time to solve is far more than if the problem was given to one solver instance. In other words, if we decompose the problem into 4 sub-problems, and each sub-problem takes on average T time to solve, then the total average time is 4T, but if we gave the problem to one solver, it would have found the solution on average in e.g. 2T time. In other words, the gain is not as much as one would hope. I would guess that this ratio (2T/4T = 1/2 in this case) is even worse when the problem’s abstract form is totally unknown (which is most of the time), and the solver has to guess a guiding path.

The second approach, that of ManySat is to start the same solver with the same problem multiple times, but use different restart heuristics to solve the problem. Since modern DPLL-based SAT solvers use randomised algorithms, this helps to reduce the expected time to solve an instance. Also, the solvers share short clauses with one another to work in a more collaborative manner. However, I believe that given the distribution of the solving time of a problem with a given solver, even if more instances of it are launched (each with a random seed and different restart strategy), the minimum solving time of all launched solvers will not be as small as one would hope. It is extremely rare that a difficult instance can be solved in under e.g. 1 minute if the average time to solve it is 1000 minutes. So, if we were to launch 1000 solver instances, probably none would finish before 2-300 minutes. Of course I am not counting the effect of shared learnt clauses here, but I am not very convinced about them. Apparently, activity-based learnt clause sharing (instead of size-based) is more effective than size-based (see this), and ManySat shares them according to the size.

So, if all else fails, what is there left to do? Well, I bought the OpenMP book to find out. I wish to implement the idea of distributing clauses to different threads, thus distributing the BCP (Boolean Constraint Propagation) load. However, multi-threading can only go so far, so MPI will eventually need to be used, along with multiple instances of the solver on the same physical CPU, the way ManySAT is doing it — though I will try to implement some form of clause-sharing. The good thing is, that implementing MPI will also bring the possibility of running a problem instance on a huge cluster, something I am really looking forward to.

EDITED TO ADD (16/07/2010): I just realised that MiraXT does quite a number of things that I wish to implement into CryptoMiniSat.