Employing reinforcement learning to enhance particle swarm optimization methods

Particle swarm optimization (PSO) is a well-known optimization algorithm that performs well on a wide range of optimization problems. However, PSO often suffers from slow convergence. In this article, a reinforcement learning strategy is developed to improve the convergence of PSO by replacing the uniformly distributed random number in the update function with a random number drawn from a selected normal distribution. In the proposed method, a policy net estimates the mean and standard deviation of the normal distribution from the current state of each individual. The historical behaviour of the swarm is used to update the policy net and to guide the selection of the parameters of the normal distribution. The proposed method is integrated into both the original PSO and a state-of-the-art PSO variant, the self-adaptive dynamic multi-swarm PSO (sDMS-PSO), and tested on numerical functions and engineering problems. The test results show that the proposed reinforcement learning strategy can improve the convergence rate of PSO methods.
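The core idea above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the common inertia-weight form of the PSO velocity update and draws the random factors from a normal distribution instead of a uniform one. The function `policy_stub`, the choice of state (offsets to the personal and global best), the linear weights, and the coefficient values `w`, `c1`, `c2` are all hypothetical placeholders; the abstract does not specify the policy net architecture, the state definition, or how the net is trained.

```python
import numpy as np

def policy_stub(state, weights):
    """Hypothetical stand-in for the policy net: maps a state to (mean, std)."""
    out = state @ weights                    # linear map to two outputs
    mean = out[0]
    std = np.logaddexp(0.0, out[1]) + 1e-6   # softplus keeps std positive
    return mean, std

def pso_step(pos, vel, pbest, gbest, weights, rng, w=0.729, c1=1.494, c2=1.494):
    """One PSO update in which the random factors are normally distributed.

    pos, vel, pbest: arrays of shape (n_particles, dim); gbest: shape (dim,).
    weights: shape (2 * dim, 2), parameters of the hypothetical policy stub.
    """
    new_vel = np.empty_like(vel)
    for i in range(pos.shape[0]):
        to_pbest = pbest[i] - pos[i]
        to_gbest = gbest - pos[i]
        # Assumed state: the particle's offsets to its personal and global best.
        state = np.concatenate([to_pbest, to_gbest])
        mean, std = policy_stub(state, weights)
        # Normal draws replace the usual uniform U(0, 1) random numbers.
        r1 = rng.normal(mean, std, size=pos.shape[1])
        r2 = rng.normal(mean, std, size=pos.shape[1])
        new_vel[i] = w * vel[i] + c1 * r1 * to_pbest + c2 * r2 * to_gbest
    return pos + new_vel, new_vel
```

In a full method the weights would be updated from the swarm's history, for example with a policy-gradient rule; that training loop is omitted here because the abstract gives no details about it.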
