A Survey on GPU-Based Implementation of Swarm Intelligence Algorithms

Inspired by the collective behavior of natural swarms, swarm intelligence algorithms (SIAs) have been developed and widely used for solving optimization problems. When applied to complex problems, a large number of fitness function evaluations are needed to obtain an acceptable solution. To tackle this issue, graphics processing units (GPUs) have been used to accelerate the optimization procedure of SIAs. Thanks to their inherent parallelism, SIAs are well suited to parallel implementation on the GPU platform, which has achieved great success in recent years. This paper presents a comprehensive review of GPU-based parallel SIAs in accordance with a newly proposed taxonomy. Critical concerns for the efficient parallel implementation of SIAs are described in detail. Moreover, novel criteria are proposed to evaluate and compare parallel implementations and algorithm performance universally. The rationality and practicability of the proposed optimization methodology and criteria are verified by a careful case study. Finally, our opinions and perspectives on the trends and prospects of this relatively new research domain are presented for future development.
