Optimization of parallel iterated local search algorithms on graphics processing unit

Local search metaheuristics (LSMs) are efficient methods for solving hard optimization problems in science, engineering, economics and technology. With LSMs, a satisfactory solution (an approximate optimum) can be obtained in a reasonable time. However, solving large problem instances is still very CPU time-consuming. As graphics processing units (GPUs) have evolved to support general-purpose computing, they have become a major accelerator in scientific and industrial computing. In this paper, we present an optimized parallel iterated local search algorithm efficiently accelerated on GPUs and test it on a typical case study in computational science, the Travelling Salesman Problem (TSP). We introduce the following novel methods. First, we present an efficient mapping between a neighborhood and a GPU thread. Second, we use the Roofline model to analyze the performance of existing GPU-based 2-opt kernels. Based on this analysis, we identify the limiting factor of these 2-opt kernels and propose our optimization approaches. Furthermore, we test our algorithm on standard TSP instances with up to 4461 cities, for which our strategy achieves a speedup of 279× over the sequential counterpart. We compare our approach with existing high-performance GPU-based local search algorithms, and the results demonstrate that the proposed algorithm is competitive.
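
To make the idea of a neighborhood-to-thread mapping concrete, the following minimal sketch shows one common way to assign each 2-opt move to a flat thread index. It is an illustration under stated assumptions, not the mapping proposed in this paper, which the abstract does not specify. The function names (`thread_to_pair`, `two_opt_delta`, `best_two_opt_move`) are hypothetical, the triangular indexing scheme is assumed, distances are taken to be symmetric, and a Python loop stands in for the GPU threads that would each evaluate one move in parallel.

```python
import math


def thread_to_pair(k):
    """Map a flat thread index k >= 0 to a pair (i, j) with 0 <= i < j.

    Assumed enumeration: k = j * (j - 1) // 2 + i, i.e. (0,1), (0,2), (1,2), (0,3), ...
    """
    j = (1 + math.isqrt(1 + 8 * k)) // 2
    i = k - j * (j - 1) // 2
    return i, j


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def two_opt_delta(tour, coords, i, j):
    """Change in tour length if the segment tour[i+1 .. j] is reversed."""
    n = len(tour)
    a, b = coords[tour[i]], coords[tour[i + 1]]
    c, d = coords[tour[j]], coords[tour[(j + 1) % n]]
    return dist(a, c) + dist(b, d) - dist(a, b) - dist(c, d)


def best_two_opt_move(tour, coords):
    """Evaluate every move, one per simulated thread, then reduce to the minimum."""
    n = len(tour)
    num_threads = n * (n - 1) // 2  # one simulated thread per pair (i, j)
    best = (0.0, -1, -1)            # (delta, i, j); delta 0.0 means no improvement
    for k in range(num_threads):    # on a GPU, each k would be a separate thread
        i, j = thread_to_pair(k)
        delta = two_opt_delta(tour, coords, i, j)
        if delta < best[0]:
            best = (delta, i, j)
    return best


def apply_move(tour, i, j):
    """Return a new tour with the segment tour[i+1 .. j] reversed."""
    return tour[: i + 1] + tour[i + 1 : j + 1][::-1] + tour[j + 1 :]


def tour_length(tour, coords):
    """Total length of the closed tour."""
    n = len(tour)
    return sum(dist(coords[tour[p]], coords[tour[(p + 1) % n]]) for p in range(n))
```

In this sketch the closed-form index computation replaces a nested loop over `(i, j)`, so every simulated thread derives its own move from its index alone, with no lookup table. Whether the integer square root is the cheapest choice on a given GPU is a separate question that this sketch does not address.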
