Learning Variable Ordering Heuristics for Solving Constraint Satisfaction Problems

Backtracking search algorithms are often used to solve the Constraint Satisfaction Problem (CSP), and their efficiency depends greatly on the variable ordering heuristic. Currently, the most commonly used heuristics are hand-crafted based on expert knowledge. In this paper, we propose a deep reinforcement learning-based approach to automatically discover new variable ordering heuristics that are better adapted to a given class of CSP instances. We show that directly optimizing the search cost is hard for bootstrapping, and instead propose to optimize the expected cost of reaching a leaf node in the search tree. To capture the complex relations among the variables and constraints, we design a representation scheme based on a Graph Neural Network (GNN) that can process CSP instances of different sizes and constraint arities. Experimental results on random CSP instances show that the learned policies outperform classical hand-crafted heuristics in terms of minimizing the search tree size, and generalize effectively to instances larger than those used in training.
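
To make the described architecture concrete, here is a minimal illustrative sketch in PyTorch of the representation idea, not the authors' implementation: a CSP instance is encoded as a variable-constraint bipartite graph (which naturally accommodates different instance sizes and constraint arities), a few rounds of message passing produce variable embeddings, and a small head scores variables for branching. The node features (domain size, degree, arity), layer sizes, and the class name CSPGraphScorer are assumptions made for illustration; the paper's actual features, network, and DQN-style training toward the expected cost of reaching a leaf node are not reproduced here.

# Minimal sketch (assumed names and features; not the paper's implementation).
import torch
import torch.nn as nn

class CSPGraphScorer(nn.Module):
    """Embeds a variable-constraint bipartite graph and scores variables for branching."""
    def __init__(self, dim=32, rounds=2):
        super().__init__()
        self.var_init = nn.Linear(2, dim)    # illustrative variable features: [domain size, degree]
        self.con_init = nn.Linear(1, dim)    # illustrative constraint feature: [arity]
        self.var_upd = nn.GRUCell(dim, dim)  # update variables from aggregated constraint messages
        self.con_upd = nn.GRUCell(dim, dim)  # update constraints from aggregated variable messages
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))
        self.rounds = rounds

    def forward(self, var_feat, con_feat, edges):
        # edges: (E, 2) long tensor of (variable index, constraint index) incidences
        v = torch.relu(self.var_init(var_feat))
        c = torch.relu(self.con_init(con_feat))
        vi, ci = edges[:, 0], edges[:, 1]
        for _ in range(self.rounds):
            # constraints aggregate messages from incident variables, then variables from constraints
            c = self.con_upd(torch.zeros_like(c).index_add_(0, ci, v[vi]), c)
            v = self.var_upd(torch.zeros_like(v).index_add_(0, vi, c[ci]), v)
        return self.score(v).squeeze(-1)     # one branching score per variable

# Toy instance: 3 variables, 2 constraints (in practice a mask would restrict scores to unassigned variables).
var_feat = torch.tensor([[3., 2.], [3., 1.], [2., 1.]])
con_feat = torch.tensor([[2.], [2.]])
edges = torch.tensor([[0, 0], [1, 0], [0, 1], [2, 1]])
scores = CSPGraphScorer()(var_feat, con_feat, edges)
next_var = int(scores.argmax())              # variable selected for the next branching decision

In a full pipeline, such scores would serve as value estimates inside the backtracking search loop, with the highest-scoring unassigned variable chosen at each decision point and the network trained from search experience.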
