Hybrid pointer networks for traveling salesman problems optimization

In this work, we propose the hybrid pointer network (HPN), an end-to-end deep reinforcement learning architecture for the traveling salesman problem (TSP). HPN builds upon the graph pointer network (GPN), an extension of pointer networks with an additional graph embedding layer. HPN combines the graph embedding layer with the transformer’s encoder to produce multiple embeddings for the feature context. We conducted extensive experiments comparing HPN with GPN. For the sake of fairness, we used the same settings as proposed in the GPN paper. The experimental results show that our network significantly outperforms the original graph pointer network on both small and large-scale problems. For example, it reduced the cost for traveling salesman problems with 50 cities/nodes (TSP50) from 5.959 to 5.706 (a reduction of about 4.2%) without using 2-opt. Moreover, we solved benchmark instances of variable sizes using HPN and GPN and compared the solution costs and testing times using linear mixed-effects models. We found that our model yields statistically significantly better solutions in terms of total trip cost. We make our data, models, and code publicly available at https://github.com/AhmedStohy/Hybrid-Pointer-Networks.
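For readers unfamiliar with the two quantities the abstract refers to, the following is a minimal Python sketch of the tour cost (total length of a closed tour) and of a plain 2-opt local search. It is an illustration only, not the HPN or GPN code from the repository: the function names `tour_length` and `two_opt`, the Euclidean distance, and the random 50-node instance in the unit square are all assumptions made for this example.

```python
import math
import random


def tour_length(points, tour):
    """Total Euclidean length of the closed tour (returns to the start)."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]])
        for i in range(n)
    )


def two_opt(points, tour):
    """Plain 2-opt: reverse a segment whenever that shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a node; nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


# Hypothetical TSP50-style instance: 50 random nodes in the unit square.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(50)]
initial = list(range(50))          # arbitrary starting tour
refined = two_opt(points, initial)
print(tour_length(points, initial), tour_length(points, refined))
```

In a learned pipeline, a routine like `two_opt` would typically be applied as a post-processing step to the tour produced by the network; the abstract's TSP50 figures are reported without that step.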
