Reinforcement Learning-based Control of Nonlinear Systems using Carleman Approximation: Structured and Unstructured Designs

We develop data-driven reinforcement learning (RL) control designs for input-affine nonlinear systems. Using Carleman linearization, we express the state-space representation of the nonlinear dynamical model in the Carleman space, and develop a real-time algorithm that learns nonlinear state-feedback controllers from state and input measurements in the infinite-dimensional Carleman space. We then study the practicality of a finite-order truncation of the control signal, and analyze the closed-loop stability of the resulting truncated design. Finally, we develop two additional designs that learn structured and sparse representations of the RL-based nonlinear controller, and provide theoretical conditions that ensure their closed-loop stability. We present numerical examples showing that our proposed method generates closed-loop responses close to the optimal performance of the nonlinear plant. We also compare our designs with other data-driven nonlinear RL control methods, such as those based on neural networks, and illustrate their relative advantages and drawbacks.
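To make the Carleman lifting concrete, below is a minimal numerical sketch, not taken from the paper: it assumes a toy scalar plant xdot = -x + x^2 + u, a truncation order of 3, forward-Euler integration, and a hypothetical fixed feedback gain k standing in for the RL-learned controller. Lifting the state to z = [x, x^2, x^3] turns the input-affine dynamics into the truncated bilinear form zdot = A z + (B0 + N1 z) u.

```python
# Minimal sketch of Carleman linearization for a scalar input-affine system.
# Assumptions (not from the paper): toy dynamics xdot = -x + x^2 + u,
# truncation order 3, forward-Euler integration, and a stand-in linear
# feedback u = -k*x in place of the RL-learned controller.
import numpy as np

# Truncated Carleman matrices, derived by differentiating each monomial x^k:
#   d(x)/dt   = -z1 + z2 + u
#   d(x^2)/dt = -2 z2 + 2 z3 + 2 z1 u
#   d(x^3)/dt = -3 z3 + 3 x^4 + 3 z2 u   (the x^4 term is dropped by truncation)
A = np.array([[-1.0,  1.0,  0.0],
              [ 0.0, -2.0,  2.0],
              [ 0.0,  0.0, -3.0]])
B0 = np.array([1.0, 0.0, 0.0])
N1 = np.array([[0.0, 0.0, 0.0],
               [2.0, 0.0, 0.0],
               [0.0, 3.0, 0.0]])

def carleman_step(z, u, dt):
    """One forward-Euler step of the truncated (order-3) Carleman model."""
    return z + dt * (A @ z + (B0 + N1 @ z) * u)

def true_step(x, u, dt):
    """One forward-Euler step of the original nonlinear plant."""
    return x + dt * (-x + x**2 + u)

# Compare the truncated Carleman prediction against the true trajectory.
dt, T, k = 1e-3, 5.0, 0.5
x = 0.4
z = np.array([x, x**2, x**3])
for _ in range(int(T / dt)):
    u = -k * x
    z = carleman_step(z, u, dt)
    x = true_step(x, u, dt)
print(f"true x(T) = {x:.6f}, Carleman z1(T) = {z[0]:.6f}")
```

Even at this low truncation order, the first lifted coordinate z1 tracks the true state closely near the origin; the paper's stability analysis concerns exactly when such finite-order truncations remain valid in closed loop.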
