A Learning-based Iterative Method for Solving Vehicle Routing Problems

This paper is concerned with solving combinatorial optimization problems, in particular the capacitated vehicle routing problem (CVRP). Classical Operations Research (OR) algorithms such as LKH3 (Helsgaun, 2017) are extremely inefficient (e.g., 13 hours on CVRP instances of only size 100) and difficult to scale to larger problems. Machine learning based approaches have recently been shown to be promising, partly because of their efficiency: once trained, they can solve instances within minutes or even seconds. However, there is still a considerable gap between the quality of a machine-learned solution and what OR methods can offer (e.g., on CVRP-100, the best learned solutions have costs between 16.10 and 16.80, significantly worse than LKH3's 15.65). In this paper, we present the first learning-based approach for CVRP that is efficient in solving speed and at the same time outperforms OR methods. Starting with a random initial solution, our algorithm learns to iteratively refine the solution with an improvement operator selected by a reinforcement learning based controller. The operator is chosen from a pool of powerful operators that are customized for routing problems. By combining the strengths of the two worlds, our approach achieves new state-of-the-art results on CVRP, e.g., an average cost of 15.57 on CVRP-100.
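The iterative scheme described in the abstract (random initial solution, then repeated refinement by an operator picked from a pool) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the two operators (`two_opt`, `relocate`), the function names, and the parameters are hypothetical choices made for illustration, and a simple epsilon-greedy bandit stands in for the reinforcement learning based controller. The sketch also assumes that node 0 is the depot and that no single demand exceeds the vehicle capacity.

```python
import math
import random

def route_cost(route, coords):
    # Length of depot -> customers -> depot; node 0 is assumed to be the depot.
    path = [0] + route + [0]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def total_cost(routes, coords):
    return sum(route_cost(r, coords) for r in routes)

def random_solution(demands, capacity, rng):
    # Shuffle customers, opening a new route whenever capacity would be exceeded.
    customers = list(range(1, len(demands)))
    rng.shuffle(customers)
    routes, load = [[]], 0
    for c in customers:
        if load + demands[c] > capacity:
            routes.append([])
            load = 0
        routes[-1].append(c)
        load += demands[c]
    return routes

def two_opt(routes, demands, capacity, rng):
    # Hypothetical operator: reverse a random segment inside one route.
    new = [r[:] for r in routes]
    r = rng.choice(new)
    if len(r) >= 2:
        i, j = sorted(rng.sample(range(len(r)), 2))
        r[i:j + 1] = reversed(r[i:j + 1])
    return new

def relocate(routes, demands, capacity, rng):
    # Hypothetical operator: move one customer to another route if it fits.
    new = [r[:] for r in routes]
    if len(new) < 2:
        return new
    s, d = rng.sample(range(len(new)), 2)
    k = rng.randrange(len(new[s]))
    c = new[s][k]
    if sum(demands[x] for x in new[d]) + demands[c] <= capacity:
        new[s].pop(k)
        new[d].insert(rng.randrange(len(new[d]) + 1), c)
    return [r for r in new if r]  # drop routes that became empty

def improve(coords, demands, capacity, steps=2000, eps=0.2, seed=0):
    rng = random.Random(seed)
    operators = [two_opt, relocate]
    value = [0.0] * len(operators)  # running mean reward per operator
    count = [0] * len(operators)
    current = random_solution(demands, capacity, rng)
    start_cost = best_cost = total_cost(current, coords)
    for _ in range(steps):
        # Epsilon-greedy stand-in for the learned controller (an assumption).
        if rng.random() < eps:
            a = rng.randrange(len(operators))
        else:
            a = max(range(len(operators)), key=lambda i: value[i])
        candidate = operators[a](current, demands, capacity, rng)
        cand_cost = total_cost(candidate, coords)
        reward = max(0.0, best_cost - cand_cost)  # reward is the cost reduction
        count[a] += 1
        value[a] += (reward - value[a]) / count[a]
        if cand_cost < best_cost:  # accept only strict improvements
            current, best_cost = candidate, cand_cost
    return current, start_cost, best_cost

# Small synthetic instance: 20 customers with random coordinates and demands.
inst_rng = random.Random(1)
coords = [(0.5, 0.5)] + [(inst_rng.random(), inst_rng.random()) for _ in range(20)]
demands = [0] + [inst_rng.randint(1, 9) for _ in range(20)]
routes, start_cost, best_cost = improve(coords, demands, capacity=30)
```

Because the loop only accepts strict improvements, this sketch is plain hill climbing and can stall in a local optimum; it shows the control flow of operator selection and refinement, not the solution quality reported above.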

[1]  J. F. Pierce,et al.  On the Truck Dispatching Problem , 1971 .

[2]  Le Song,et al.  2 Common Formulation for Greedy Algorithms on Graphs , 2018 .

[3]  Samy Bengio,et al.  Neural Combinatorial Optimization with Reinforcement Learning , 2016, ICLR.

[4]  Keld Helsgaun,et al.  An effective implementation of the Lin-Kernighan traveling salesman heuristic , 2000, Eur. J. Oper. Res..

[5]  Paolo Toth,et al.  Models, relaxations and exact approaches for the capacitated vehicle routing problem , 2002, Discret. Appl. Math..

[6]  Yuandong Tian,et al.  Learning to Perform Local Rewriting for Combinatorial Optimization , 2019, NeurIPS.

[7]  Lior Wolf,et al.  Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks , 2018, ArXiv.

[8]  Thibaut Vidal,et al.  New benchmark instances for the Capacitated Vehicle Routing Problem , 2017, Eur. J. Oper. Res..

[9]  David Ödling.  A metaheuristic for vehicle routing problems based on reinforcement learning , 2018 .

[10]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[11]  Max Welling,et al.  Attention, Learn to Solve Routing Problems! , 2018, ICLR.

[12]  Nicolas Jozefowiez,et al.  The vehicle routing problem: Latest advances and new challenges , 2007 .

[13]  G. Clarke,et al.  Scheduling of Vehicles from a Central Depot to a Number of Delivery Points , 1964 .

[14]  Alexandre Lacoste,et al.  Learning Heuristics for the TSP by Policy Gradient , 2018, CPAIOR.

[15]  Ramasamy Panneerselvam,et al.  A Survey on the Vehicle Routing Problem and Its Variants , 2012 .

[16]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[17]  Renato F. Werneck,et al.  Robust Branch-and-Cut-and-Price for the Capacitated Vehicle Routing Problem , 2004, Math. Program..

[18]  Keld Helsgaun,et al.  An Extension of the Lin-Kernighan-Helsgaun TSP Solver for Constrained Traveling Salesman and Vehicle Routing Problems: Technical report , 2017 .

[19]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[20]  Zhuwen Li,et al.  Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search , 2018, NeurIPS.

[21]  Lawrence V. Snyder,et al.  Reinforcement Learning for Solving the Vehicle Routing Problem , 2018, NeurIPS.

[22]  Srikanth Kandula,et al.  Resource Management with Deep Reinforcement Learning , 2016, HotNets.

[23]  Paolo Toth,et al.  Vehicle Routing , 2014, Vehicle Routing.

[24]  Andrea Lodi,et al.  On learning and branching: a survey , 2017 .

[25]  Michele Lombardi,et al.  Boosting Combinatorial Problem Modeling with Machine Learning , 2018, IJCAI.

[26]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[27]  Brian W. Kernighan,et al.  An Effective Heuristic Algorithm for the Traveling-Salesman Problem , 1973, Oper. Res..

[28]  Kate Smith-Miles,et al.  Neural Networks for Combinatorial Optimization: A Review of More Than a Decade of Research , 1999, INFORMS J. Comput..

[29]  Michel Gendreau,et al.  Hyper-heuristics: a survey of the state of the art , 2013, J. Oper. Res. Soc..

[30]  Yoshua Bengio,et al.  Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon , 2018, Eur. J. Oper. Res..