Paper information - Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning

Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning

We present GQSAT, a branching heuristic for a Boolean SAT solver, trained with value-based reinforcement learning (RL) using graph neural networks for function approximation. Solvers using GQSAT are complete SAT solvers that provide either a satisfying assignment or a proof of unsatisfiability, which many SAT applications require. The branching heuristics commonly used in SAT solvers today make poor decisions during their warm-up period, whereas GQSAT is trained to examine the structure of the particular problem instance and so make better decisions at the beginning of the search. Training GQSAT is data-efficient and requires neither elaborate dataset preparation nor feature engineering. We train GQSAT on small SAT problems using RL, interfacing with an existing SAT solver. We show that GQSAT reduces the number of iterations required to solve SAT problems by 2-3X, and that it generalizes to unsatisfiable SAT instances as well as to problems with 5X more variables than it was trained on. We also show that, to a lesser extent, it generalizes to SAT problems from a different domain by evaluating it on graph coloring. Our experiments show that augmenting SAT solvers with agents trained with RL and graph neural networks can improve performance on the SAT search problem.
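To make the idea of a value-based branching heuristic concrete, here is a minimal sketch, not the authors' implementation. It assumes clauses are given as lists of DIMACS-style signed integers and encodes the formula as variable-clause edges. The function names, the edge encoding, and the occurrence-count scoring are all illustrative assumptions. In GQSAT the Q-values would come from a trained graph neural network rather than from the hand-written placeholder used here.

```python
from typing import Dict, List, Set, Tuple

Clause = List[int]          # DIMACS-style literals: 3 means x3, -3 means NOT x3
Action = Tuple[int, bool]   # (variable, value to assign)


def cnf_to_bipartite_edges(clauses: List[Clause]) -> List[Tuple[int, int, int]]:
    """Return edges (variable, clause_index, polarity).

    polarity is 1 for a positive literal and 0 for a negated one.
    """
    edges = []
    for clause_index, clause in enumerate(clauses):
        for literal in clause:
            edges.append((abs(literal), clause_index, 1 if literal > 0 else 0))
    return edges


def placeholder_q_values(
    edges: List[Tuple[int, int, int]], assigned: Set[int]
) -> Dict[Action, float]:
    """Hypothetical stand-in for a learned Q-function.

    Scores each (variable, value) action by how many literals it satisfies.
    A trained model would replace this function entirely.
    """
    q: Dict[Action, float] = {}
    for variable, _clause_index, polarity in edges:
        if variable in assigned:
            continue  # only unassigned variables are valid branching choices
        action = (variable, bool(polarity))
        q[action] = q.get(action, 0.0) + 1.0
    return q


def greedy_branching_action(q: Dict[Action, float]) -> Action:
    """Pick the action with the highest Q-value.

    Ties go to the smaller variable index, then to False over True.
    Raises ValueError if q is empty (no unassigned variables left).
    """
    return max(q, key=lambda a: (q[a], -a[0], not a[1]))


# (x1 OR NOT x2) AND (NOT x1 OR x3) AND (x1 OR x2 OR x3)
clauses = [[1, -2], [-1, 3], [1, 2, 3]]
edges = cnf_to_bipartite_edges(clauses)
first_decision = greedy_branching_action(placeholder_q_values(edges, assigned=set()))
```

In a complete solver, a decision like `first_decision` would only replace the choice of the next branching variable and value. Propagation, conflict analysis, and backtracking stay with the underlying solver, which is what keeps the overall procedure complete.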
