Model-Free Linear Discrete-Time System H∞ Control Using Input-Output Data

In this paper, a data-driven output feedback H∞ control algorithm is proposed for discrete-time linear systems with unknown dynamics. First, the standard model-based state feedback H∞ controller is formulated. Then, since the state may not be measurable and the system dynamics are unknown, the value function of the H∞ control problem is reformulated in terms of historical input and output data. Using this reformulated value function and an adaptive dynamic programming approach, an online data-driven output feedback H∞ control algorithm for linear discrete-time systems is developed. Unlike on-policy reinforcement-learning-based model-free control algorithms, the proposed algorithm eliminates the influence of probing noise and thereby guarantees unbiased solutions. A simulation example verifies the effectiveness of the proposed control algorithm.
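The starting point mentioned in the abstract, the model-based state feedback H∞ controller, can be posed as a two-player zero-sum game and solved through the game algebraic Riccati equation. The sketch below illustrates only that model-based baseline (not the paper's data-driven output feedback algorithm), assuming a system x_{k+1} = A x_k + B u_k + E w_k with cost Σ (x_k'Q x_k + u_k'R u_k − γ² w_k'w_k); the system matrices and attenuation level γ used here are illustrative placeholders.

```python
import numpy as np

def hinf_state_feedback(A, B, E, Q, R, gamma, iters=1000, tol=1e-10):
    """Value iteration on the game algebraic Riccati equation (GARE):
    P = Q + A'PA - [A'PB  A'PE] M^{-1} [B'PA; E'PA], where
    M = [[R + B'PB, B'PE], [E'PB, E'PE - gamma^2 I]]."""
    n, m, q = A.shape[0], B.shape[1], E.shape[1]
    P = np.zeros((n, n))
    for _ in range(iters):
        M = np.block([[R + B.T @ P @ B, B.T @ P @ E],
                      [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)]])
        N = np.vstack([B.T @ P @ A, E.T @ P @ A])
        P_next = Q + A.T @ P @ A - N.T @ np.linalg.solve(M, N)
        if np.linalg.norm(P_next - P) < tol:
            P = P_next
            break
        P = P_next
    # Saddle-point policies recovered from the converged P:
    # controller u_k = -K x_k and worst-case disturbance w_k = -Kw x_k.
    M = np.block([[R + B.T @ P @ B, B.T @ P @ E],
                  [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)]])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    gains = np.linalg.solve(M, N)
    K, Kw = gains[:m, :], gains[m:, :]
    return P, K, Kw

if __name__ == "__main__":
    # Placeholder second-order system for illustration only.
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    E = np.array([[0.1], [0.0]])
    Q, R = np.eye(2), np.eye(1)
    P, K, Kw = hinf_state_feedback(A, B, E, Q, R, gamma=2.0)
    print("P =\n", P, "\nK =", K, "\nKw =", Kw)
```

The data-driven algorithm described in the abstract replaces the explicit use of A, B, and E above with quantities estimated from measured input-output data, which is what makes the model-free, output feedback formulation necessary.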
