Attention Solves Your TSP

We propose a framework for solving combinatorial optimization problems whose output can be represented as a sequence of input elements. As an alternative to the Pointer Network, we parameterize a policy by a model based entirely on (graph) attention layers, and train it efficiently using REINFORCE with a simple and robust baseline based on a deterministic (greedy) rollout of the best policy found during training. We significantly improve over state-of-the-art results for learning algorithms for the 2D Euclidean TSP, reducing the optimality gap for a single tour construction by more than 75% (to 0.33%) for instances with 20 nodes and by more than 50% (to 2.28%) for instances with 50 nodes.
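The training signal described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the attention model is replaced by a hypothetical one-parameter toy policy (next-node probability proportional to exp(-theta * distance)), and the names `tour_length`, `rollout`, and `reinforce_step` are assumptions introduced only for this example. What it does show is the structure of the update: a tour sampled from the current policy is compared against a deterministic greedy rollout of a frozen baseline policy, and the difference in tour length scales the log-probability gradient.

```python
import math
import random


def tour_length(coords, tour):
    """Length of the closed tour that visits coords in the given order."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))


def rollout(coords, theta, greedy, rng):
    """Construct a tour node by node; return (tour, d log p(tour) / d theta).

    Toy policy (an assumption, NOT the paper's attention model):
    p(next = j) is proportional to exp(-theta * dist(current, j)).
    """
    tour, grad = [0], 0.0
    unvisited = set(range(1, len(coords)))
    while unvisited:
        cand = sorted(unvisited)
        dists = [math.dist(coords[tour[-1]], coords[j]) for j in cand]
        logits = [-theta * d for d in dists]
        top = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(l - top) for l in logits]
        total = sum(weights)
        probs = [w / total for w in weights]
        if greedy:
            k = max(range(len(cand)), key=probs.__getitem__)
        else:
            k = rng.choices(range(len(cand)), weights=probs)[0]
        # d/dtheta log p_k = -d_k + sum_j p_j * d_j for this softmax policy
        grad += -dists[k] + sum(p * d for p, d in zip(probs, dists))
        tour.append(cand[k])
        unvisited.remove(cand[k])
    return tour, grad


def reinforce_step(coords, theta, baseline_theta, lr, rng):
    """One REINFORCE update using a greedy rollout of the baseline policy."""
    sampled, grad_logp = rollout(coords, theta, greedy=False, rng=rng)
    baseline_tour, _ = rollout(coords, baseline_theta, greedy=True, rng=rng)
    # Advantage: how much longer (or shorter) the sample is than the baseline tour
    advantage = tour_length(coords, sampled) - tour_length(coords, baseline_tour)
    # Gradient descent on expected tour length
    new_theta = theta - lr * advantage * grad_logp
    return new_theta, advantage


# Example: one update on a random 10-node instance in the unit square
rng = random.Random(0)
coords = [(rng.random(), rng.random()) for _ in range(10)]
theta, advantage = reinforce_step(coords, theta=1.0, baseline_theta=1.0, lr=0.01, rng=rng)
```

In this sketch `baseline_theta` stays fixed; in a full training loop it would be replaced by the current parameters whenever the current policy becomes the best one found so far. Because the baseline tour is built greedily, it adds no sampling noise of its own to the gradient estimate.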
