Neuroevolution with CMA-ES for Real-time Gain Tuning of a Car-like Robot Controller

This paper proposes a method for dynamically varying the gains of a mobile robot controller that accounts not only for errors relative to the reference trajectory but also for uncertainty in the localisation. To do so, the covariance matrix of a state observer is used as an indicator of the precision of the perception. CMA-ES, an evolutionary algorithm, is used to train a neural network capable of adapting the robot's behaviour in real time, using a car-like vehicle model in simulation. Promising results show significant improvements in trajectory-following performance thanks to the control-gain variations produced by this method. Simulations demonstrate the system's ability to control the robot in complex environments in which classical static controllers cannot guarantee stable behaviour.
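The idea described above can be sketched in code. The following is a hypothetical, simplified illustration, not the paper's actual implementation: a tiny neural network maps the tracking error and the trace of the observer covariance to a positive control gain, and its weights are optimised by a plain evolution strategy on a toy 1-D tracking task. All function names, the network shape, the surrogate covariance schedule, and the simplified ES (full CMA-ES also adapts the sampling covariance matrix, which is omitted here) are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def gain_network(params, error, cov_trace):
    """Tiny 2-4-1 network; a softplus output keeps the gain positive."""
    w1 = params[:8].reshape(2, 4)
    b1 = params[8:12]
    w2 = params[12:16]
    b2 = params[16]
    h = np.tanh(np.array([error, cov_trace]) @ w1 + b1)
    return np.log1p(np.exp(h @ w2 + b2))  # softplus

def rollout(params, steps=50, seed=1):
    """Toy 1-D tracking task with a time-varying observation noise
    (a stand-in for the observer covariance); returns accumulated cost."""
    local_rng = np.random.default_rng(seed)  # fixed seed -> deterministic fitness
    x, ref, cost = 0.0, 1.0, 0.0
    for t in range(steps):
        cov = 0.01 + 0.2 * abs(np.sin(0.3 * t))   # surrogate covariance trace
        obs = x + local_rng.normal(0.0, np.sqrt(cov))
        err = ref - obs
        k = gain_network(params, err, cov)         # gain adapted each step
        x += 0.1 * k * err                         # simple closed-loop update
        cost += (ref - x) ** 2
    return cost

def evolve(generations=30, pop=16, sigma=0.5):
    """Simplified evolution strategy: sample around the mean, recombine
    the best quarter, and keep the best candidate ever seen."""
    mean = np.zeros(17)
    best, best_cost = mean, rollout(mean)
    for _ in range(generations):
        samples = mean + sigma * rng.normal(size=(pop, 17))
        costs = np.array([rollout(s) for s in samples])
        i = int(np.argmin(costs))
        if costs[i] < best_cost:
            best, best_cost = samples[i], costs[i]
        elite = samples[np.argsort(costs)[: pop // 4]]
        mean = elite.mean(axis=0)
    return best, best_cost
```

Feeding the observer covariance into the network is what lets the learned policy lower its gains when localisation is poor and raise them when it is precise, which is the behaviour the abstract attributes to the proposed method.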
