Machine Learning and SAT | Wonderings of a SAT geek

I have lately been digging myself into a deep hole with machine learning. While doing that it occurred to me that the SAT community has essentially been trying to imitate some of ML in a somewhat poor way. Let me explain.

CryptoMiniSat and clause cleaning strategy selection

When CryptoMiniSat won the SAT Race of 2010, it was in large part because I realized that glucose at the time was essentially unable to solve cryptographic problems. I devised a system where I could detect which problems were cryptographic. It checked the activity stability of variables and if they were more stable than a threshold, it was decided that the problem was cryptographic. Cryptographic problems were then solved using a geometric restart strategy with clause activities for learnt database cleaning. Without this hack, it would have been impossible to win the competition.

It is clear that there could have been a number of ways to detect that a problem is cryptographic without using such an elaborate scheme. However, that would have demanded a mixture of more features to decide. The scheme only used the average and the standard deviation.

Lingeling and clause cleaning strategy selection

The decision made by lingeling about whether to use glues or activities to clean learnt clauses is somewhat similar to my approach above. It calculates the average and the standard deviation of the learnt clauses’ glues and then makes a decision. Looking at the code, the option actavgmax/stdmin/stdmax gives the cutoffs and the function lglneedacts calculates the values and decides. This has been in lingeling since 2011 (lingeling-587f).

Probably a much better decision could be made if more data was taken into account (e.g. activities) but as a human, it’s simply hard to make a decision based on more than 2-3 pieces of data.

Enter machine learning

It is clear that the above schemes were basically trying to extract some feature from the SAT solver and then decide what features (glues/activities) to use to clear the learnt clause database. It is also clear that both have been extremely effective, it’s by no luck that they have been inside successful SAT solvers.

The question is, can we do better? I think yes. First of all, we don’t need to cut the problem into two steps. Instead, we can integrate the features extracted from the solver (variable activities, clause glue distribution, etc) and the features from the clause (glue, activities, etc.) and make a decision whether to keep the clause or not. This means we would make keep/throwaway decisions on individual clauses based on potentially hundreds of extracted features — and that requires machine learning. But it would allow much more than what we have now. By giving a large number of features to the machine learning algorithm, it could extract interesting high-level features from them. In particular, deep learning could possibly have come up with glues(!) if the decision levels of the variables (along with the current decision level) were given to it. I think there must be more and better higher-level features than just glues.

It’s highly inefficient to ask humans to come up with high-level features and decision trees for clause keep/throwaway when there are readily-available systems that can extract high-level features and make immensely complex (and effective!) decision trees in very reasonable amount of time on modern machines.

Conclusions

I think we have been wasting research and development time on coming up with high-level features and decision trees for keeping or throwing away learnt clauses. It’s time we looked at a discipline that has been exploding in computer science and stopped mocking SATZilla — effectively, all modern SAT solvers employ a hand-crafted poor man’s version of some of SATZilla.

Post Scriptum

This file in the SAT Race 2015 version of CryptoMiniSat will be super-controversial. It on-the-fly reconfigures CryptoMiniSat after ~160K of conflicts and three runs of pre- and in-processing based on over 50 extracted features. It’s crappy. And effective. I did it in one week, over the CAV conference. It simulates the decision by CryptoMiniSat and lingeling about activities/glues, and more. Thanks Ofer Strichman!