Rethinking the parallelization of random-restart hill climbing: a case study in optimizing a 2-opt TSP solver for GPU execution

Random-restart hill climbing is a common approach to combinatorial optimization problems such as the traveling salesman problem (TSP). We present and evaluate an implementation of random-restart hill climbing with 2-opt local search applied to TSP. Our implementation handles large problem sizes at high throughput. It is based on the key insight that the GPU’s hierarchical hardware parallelism is best exploited with a hierarchical implementation strategy, in which independent climbs are parallelized across thread blocks and the 2-opt move evaluations of each climb are parallelized across the threads within a block. We analyze the performance impact of this and other optimizations on our heuristic TSP solver and compare its performance to existing GPU-based 2-opt TSP solvers as well as to a parallel CPU implementation. Our code outperforms the existing GPU implementations by up to 3X, evaluating up to 60 billion 2-opt moves per second on a single K40 GPU. It also outperforms an OpenMP implementation running on 20 CPU cores by up to 8X.
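The two levels of parallelism described above can be illustrated with a small sequential sketch. This is not the authors' GPU code: it is plain Python under the assumption of Euclidean city coordinates, and all function names (`best_2opt_move`, `climb`, `random_restart`) are hypothetical. The outer loop over restarts corresponds to the work that would be spread across thread blocks, and the inner scan over all (i, j) move candidates corresponds to the work that would be spread across the threads of one block.

```python
import math
import random


def tour_length(tour, pts):
    """Total length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[k]], pts[tour[(k + 1) % n]]) for k in range(n))


def best_2opt_move(tour, pts):
    """Scan every 2-opt move and return (delta, i, j) for the best one.

    Each (i, j) evaluation is independent of the others. In a GPU version,
    these evaluations would be divided among the threads of one block and
    combined with a block-wide minimum reduction (an assumption of this
    sketch, following the abstract's description).
    """
    n = len(tour)
    best = (0.0, -1, -1)
    for i in range(n - 1):
        a, b = pts[tour[i]], pts[tour[i + 1]]
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # the two edges share a city; the move changes nothing
            c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
            # Replace edges (a, b) and (c, d) with (a, c) and (b, d).
            delta = (math.dist(a, c) + math.dist(b, d)
                     - math.dist(a, b) - math.dist(c, d))
            if delta < best[0]:
                best = (delta, i, j)
    return best


def climb(pts, rng):
    """One hill climb: start from a random tour, apply best moves until stuck."""
    tour = list(range(len(pts)))
    rng.shuffle(tour)
    while True:
        delta, i, j = best_2opt_move(tour, pts)
        if delta > -1e-9:  # no improving move left: local optimum reached
            return tour
        # Reversing the segment between the two removed edges applies the move.
        tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]


def random_restart(pts, restarts, seed=0):
    """Run several independent climbs and keep the shortest tour found.

    The climbs do not interact, so a GPU version could assign each one to
    its own thread block.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        tour = climb(pts, rng)
        if best is None or tour_length(tour, pts) < tour_length(best, pts):
            best = tour
    return best
```

For example, `random_restart(pts, restarts=10)` on a list of `(x, y)` tuples returns the best tour found as a list of city indices. The sketch recomputes distances on every evaluation for brevity; a throughput-oriented version would organize data access quite differently.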
