Synergism of Firefly Algorithm and Q-Learning for Robot Arm Path Planning

Abstract Over the past few decades, the Firefly Algorithm (FA) has attracted the attention of many researchers by virtue of its capability to solve complex real-world optimization problems. The only factor restricting the efficiency of FA is the need to balance exploration and exploitation while searching for the global optimum in the search space. This balance can be established by tuning the two inherent control parameters of FA, namely the randomization parameter and the light absorption coefficient, over iterations, either experimentally or by an automatic adaptive strategy. This paper pursues the latter by proposing an improved FA that embeds the Q-learning framework within itself. In the proposed Q-learning induced FA (QFA), the optimal parameter values for each firefly of a population are learnt by the Q-learning strategy during the learning phase and applied thereafter during execution. The proposed algorithm has been simulated on fifteen benchmark functions suggested in the CEC 2015 competition. In addition, its superiority is tested by conducting the Friedman, Iman–Davenport, and Bonferroni–Dunn tests. Moreover, its suitability for real-world constrained environments has been examined by employing the algorithm in the path planning of a robotic manipulator amidst various obstacles, for which an obstacle-avoidance mechanism is designed for the robot arm. The results, obtained from both simulation and real-world experiment, confirm the superiority of the proposed QFA over other contender algorithms in terms of solution quality as well as run-time complexity.
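The idea of letting Q-learning pick the two FA control parameters for each firefly can be illustrated with a minimal sketch. The abstract does not give the state, action, or reward definitions used in QFA, so everything below is an assumption made for illustration: a single-state Q-table per firefly, a small hypothetical grid of (randomization parameter, light absorption coefficient) pairs as actions, a reward of +1 or -1 depending on whether the move improved the firefly, greedy acceptance of moves, and the sphere function as a stand-in objective. It is not the authors' implementation.

```python
import math
import random


def sphere(x):
    """Stand-in objective (assumption): minimum 0 at the origin."""
    return sum(v * v for v in x)


# Hypothetical discrete actions: (randomization alpha, light absorption gamma).
ACTIONS = [(a, g) for a in (0.05, 0.2, 0.5) for g in (0.1, 1.0, 5.0)]


def qfa_sketch(f=sphere, dim=2, n=8, iters=60,
               lr=0.3, discount=0.8, eps=0.2, seed=0):
    """Return (initial best cost, final best cost, per-firefly Q-tables)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    cost = [f(p) for p in pop]
    initial_best = min(cost)
    # One Q-row per firefly; a single state is assumed for simplicity.
    q = [[0.0] * len(ACTIONS) for _ in range(n)]

    for _ in range(iters):
        for i in range(n):
            # Epsilon-greedy choice of a parameter pair for firefly i.
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda k: q[i][k])
            alpha, gamma = ACTIONS[a]

            new = pop[i][:]
            moved = False
            for j in range(n):
                if cost[j] < cost[i]:  # firefly j is brighter (lower cost)
                    r2 = sum((u - v) ** 2 for u, v in zip(new, pop[j]))
                    beta = math.exp(-gamma * r2)  # attractiveness decays with distance
                    new = [u + beta * (v - u) + alpha * (rng.random() - 0.5)
                           for u, v in zip(new, pop[j])]
                    moved = True
            if not moved:  # the brightest firefly takes a random step
                new = [u + alpha * (rng.random() - 0.5) for u in new]

            new_cost = f(new)
            reward = 1.0 if new_cost < cost[i] else -1.0
            # Q-learning update; the next state equals the current (single) state.
            q[i][a] += lr * (reward + discount * max(q[i]) - q[i][a])
            if new_cost < cost[i]:  # greedy acceptance (a simplification)
                pop[i], cost[i] = new, new_cost

    return initial_best, min(cost), q
```

Calling `qfa_sketch(seed=1)` returns the best cost before and after the run together with the learnt Q-tables. Because moves are accepted only when they improve a firefly, the final best cost in this sketch cannot exceed the initial one; whether the learnt parameter choices beat fixed settings would have to be checked separately.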
