Learning‐based iterative modular adaptive control for nonlinear systems

In this paper, we study adaptive trajectory tracking control for a class of nonlinear systems with structured parametric uncertainties. We propose an iterative modular approach. First, we design a robust nonlinear state feedback that renders the closed-loop system input-to-state stable (ISS), where the input is the estimation error of the uncertain parameters and the state is the closed-loop output tracking error. Next, we augment this robust ISS controller with an iterative data-driven learning algorithm that estimates the parametric uncertainties of the model online. We implement this method with two different learning approaches. The first is a data-driven multi-parametric extremum seeking (MES) method, which guarantees local convergence results. The second is a Bayesian optimization-based method called Gaussian Process Upper Confidence Bound (GP-UCB), which guarantees global results in a compact search set. The combination of the ISS feedback and the data-driven learning algorithms gives a learning-based modular indirect adaptive controller. We show the efficiency of this approach on a numerical example of a two-link robot manipulator.

International Journal of Adaptive Control and Signal Processing
Copyright © Mitsubishi Electric Research Laboratories, Inc., 2018. 201 Broadway, Cambridge, Massachusetts 02139.

Int. J. Adapt. Control Signal Process. 0000; 00:1–30. Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/acs

Mouhacine Benosman, Amir-massoud Farahmand, Meng Xia. Mitsubishi Electric Research Laboratories, 201 Broadway Street, Cambridge, MA 02139, USA (Email: m benosman@ieee.org); Mathworks, USA.
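
The learning module described in the abstract can be illustrated with a minimal sketch of a dither-based multi-parametric extremum seeking loop. This is a generic discrete-time scheme under stated assumptions, not the authors' algorithm: the function `mes_estimate`, the quadratic `tracking_cost`, the values in `theta_true`, and all gains, amplitudes, and dither periods are hypothetical choices made for illustration. In the setting of the paper, the cost would instead be computed from the measured tracking error of the ISS closed loop at each iteration.

```python
import numpy as np

# Hypothetical "true" uncertain parameters (unknown to the estimator).
theta_true = np.array([1.0, -0.5])


def tracking_cost(theta_hat):
    """Hypothetical stand-in for a tracking-error cost.

    Assumption: the cost is smallest when the estimate matches the true
    parameters. A real cost would come from closed-loop measurements.
    """
    return float(np.sum((theta_hat - theta_true) ** 2))


def mes_estimate(cost, theta0, amp=0.2, gain=0.2, periods=(10, 7), n_iter=2000):
    """Discrete-time dither-based extremum seeking over several parameters.

    Each parameter is probed with its own sinusoidal dither. Correlating the
    measured cost with that dither gives, on average, a step along the
    negative gradient of the cost with respect to that parameter.
    """
    z = np.asarray(theta0, dtype=float).copy()
    # Distinct dither frequencies keep the parameter channels separable.
    omega = 2.0 * np.pi / np.asarray(periods, dtype=float)
    for k in range(n_iter):
        dither = amp * np.sin(omega * k)
        theta_hat = z + dither      # probing estimate applied at iteration k
        q = cost(theta_hat)         # one scalar cost measurement
        z -= gain * dither * q      # correlate cost with dither, descend
    return z


estimate = mes_estimate(tracking_cost, theta0=[0.0, 0.0])
print("estimate:", estimate, "error norm:", np.linalg.norm(estimate - theta_true))
```

Only scalar cost evaluations are used, which is what makes such a scheme data-driven. The dither amplitude and gain trade convergence speed against the residual oscillation around the minimizer, and the convergence is local, consistent with the local guarantees stated for MES.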
