Feedback Linearization for Uncertain Systems via Reinforcement Learning

We present a novel approach to control design for nonlinear systems that uses model-free policy optimization to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control that renders the input-output dynamics of a nonlinear plant linear under an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. However, computing a linearizing controller requires a precise dynamics model of the system. As a result, model-based approaches for learning exact linearizing controllers generally require a simple, highly structured model of the system with easily identifiable parameters. In contrast, the model-free approach presented in this paper approximates the linearizing controller for the plant using general function approximation architectures. Specifically, we formulate a continuous-time optimization problem over the parameters of a learned linearizing controller whose optima are the parameters that best linearize the plant. We derive conditions under which the learning problem is (strongly) convex and provide guarantees that the true linearizing controller for the plant is recovered. We then discuss how model-free policy optimization algorithms can be used to solve a discrete-time approximation to the problem using data collected from the real-world plant. The utility of the framework is demonstrated in simulation and on a real-world robotic platform.
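To make the idea of learning a linearizing controller concrete, the following is a minimal sketch on a toy plant; it is not the algorithm from the paper. It assumes a frictionless pendulum with output y = q, whose output acceleration is y'' = -(g/l) sin(q) + u/(m l^2), and a hypothetical two-parameter controller u = theta[0] sin(q) + theta[1] v, where v is the desired output acceleration. The learner treats the plant as a black box and minimizes the mean squared gap between y'' and v over sampled states. For this parameterization the loss is quadratic in theta, so it is convex and its minimizer is the exact linearizing controller (theta = [m g l, m l^2]). A finite-difference gradient stands in for the policy optimization methods discussed above, and direct access to y'' is assumed for simplicity. All names and constants (G, M, L, plant_output_accel, and so on) are illustrative assumptions.

```python
import numpy as np

# Assumed pendulum constants; the learner never reads these directly.
G, M, L = 9.81, 0.5, 2.0


def plant_output_accel(q, u):
    """Black-box plant: returns y'' for angle q and torque u (toy model)."""
    return -(G / L) * np.sin(q) + u / (M * L**2)


def controller(theta, q, v):
    """Hypothetical linear-in-parameters controller u = th0*sin(q) + th1*v."""
    return theta[0] * np.sin(q) + theta[1] * v


def linearization_loss(theta, q, v):
    """Mean squared gap between the plant's y'' and the desired y'' = v."""
    u = controller(theta, q, v)
    return np.mean((plant_output_accel(q, u) - v) ** 2)


def learn_linearizing_controller(steps=1000, lr=0.2, eps=1e-4, seed=0):
    """Gradient descent on the loss using only black-box loss evaluations."""
    rng = np.random.default_rng(seed)
    q = rng.uniform(-np.pi, np.pi, size=256)  # sampled angles
    v = rng.uniform(-5.0, 5.0, size=256)      # sampled desired accelerations
    theta = np.zeros(2)
    for _ in range(steps):
        grad = np.zeros(2)
        for i in range(2):
            d = np.zeros(2)
            d[i] = eps
            # Central finite difference: needs loss values only, no model.
            grad[i] = (linearization_loss(theta + d, q, v)
                       - linearization_loss(theta - d, q, v)) / (2 * eps)
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    theta = learn_linearizing_controller()
    print("learned parameters:", theta)
    print("exact linearizing parameters:", [M * G * L, M * L**2])
```

With the learned parameters, the closed loop behaves like the double integrator y'' = v on this toy plant, so a linear tracking controller can then be designed for v. With a richer function approximator the loss is generally no longer quadratic in the parameters, which is where convexity conditions such as those derived in the paper become relevant.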
