Adaptive dynamic programming-based state quantized networked control system without value and/or policy iterations

In this paper, the Bellman equation is used to solve the stochastic optimal control problem for an unknown linear discrete-time networked control system (NCS) with communication imperfections, including random delays, packet losses, and quantization. A dynamic quantizer for the sensor measurements is proposed, which provides the system states to the controller. To eliminate the effect of the quantization error, the dynamics of the quantization error bound and an update law for tuning its range are derived. Subsequently, using an adaptive dynamic programming technique with a Q-function and reinforcement learning, the infinite-horizon optimal regulation of the uncertain NCS is solved in a forward-in-time manner without value and/or policy iterations. The asymptotic stability of the closed-loop system is verified by standard Lyapunov stability theory, and the effectiveness of the proposed method is demonstrated by simulation results.
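The abstract does not give the quantizer or its range-update law in closed form; the following is a minimal sketch of the idea, assuming a mid-rise uniform quantizer with `2**bits` levels over an adaptive range `[-M, M]` and a simple zoom rule (all function names, constants, and the saturation test are illustrative assumptions, not the paper's actual update law):

```python
import numpy as np

def quantize(x, M, bits=4):
    """Mid-rise uniform quantizer with 2**bits levels on [-M, M].

    For |x| <= M the quantization error is bounded by step/2 = M / 2**bits,
    so shrinking the range M also shrinks the error bound.
    """
    L = 2 ** bits
    step = 2.0 * M / L
    q = step * (np.floor(x / step) + 0.5)
    return float(np.clip(q, -M + step / 2, M - step / 2))

def update_range(M, q, step, zoom_out=2.0, zoom_in=0.9):
    """Illustrative range-update rule: grow the range on (near-)saturation,
    otherwise shrink it so the error bound decays as the state is regulated."""
    if abs(q) >= M - step:      # quantizer near saturation: zoom out
        return M * zoom_out
    return M * zoom_in          # state well inside the range: zoom in

# demo: the range zooms out when a sample saturates and in otherwise
M = 1.0
for x in [0.9, 0.6, 0.3, 0.1, 0.05]:
    q = quantize(x, M)
    M = update_range(M, q, 2.0 * M / 16)
```

Because the error bound `M / 2**bits` tracks the range, driving `M` down as the state converges is what lets the quantization error vanish asymptotically rather than leaving a fixed residual error floor.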
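The forward-in-time, iteration-free flavor of the ADP scheme can be illustrated, under assumptions, by Bradtke-style Q-learning for a scalar LQR problem: a quadratic Q-function is estimated online by recursive least squares on the Bellman temporal difference, and the feedback gain is read off the current estimate at every step instead of waiting for value or policy iterations to converge. The plant, cost weights, probing signal, and clipping below are illustrative stand-ins, not the paper's actual algorithm (which additionally handles delays, packet losses, and quantization):

```python
import numpy as np

# True plant, used only to simulate; the learner never reads A, B directly.
A, B = 0.8, 1.0          # illustrative scalar dynamics (assumption)
Qc, R = 1.0, 1.0         # quadratic cost weights (assumption)

rng = np.random.default_rng(0)
theta = np.zeros(3)      # Q(x,u) = th0*x^2 + th1*x*u + th2*u^2  (th1 = 2*Hxu)
P = 100.0 * np.eye(3)    # RLS covariance
lam = 0.995              # forgetting factor
K = 0.2                  # initial stabilizing gain (assumption)

def phi(x, u):
    # quadratic basis for the Q-function
    return np.array([x * x, x * u, u * u])

x = 1.0
for k in range(3000):
    # persistently exciting probing signal added to the control
    e = 0.3 * np.sin(1.7 * k) + 0.2 * np.cos(2.3 * k) + 0.1 * rng.standard_normal()
    u = -K * x + e
    r = Qc * x * x + R * u * u              # one-step cost
    x_next = A * x + B * u                  # plant step (measured, not modeled)
    # temporal-difference regressor: Q(x,u) - Q(x', greedy u') should equal r
    psi = phi(x, u) - phi(x_next, -K * x_next)
    denom = lam + psi @ P @ psi
    g = P @ psi / denom
    theta = theta + g * (r - theta @ psi)   # recursive least squares update
    P = (P - np.outer(g, psi @ P)) / lam
    Huu, Hxu = theta[2], theta[1] / 2.0
    if Huu > 1e-3:
        # forward-in-time gain update at every step, no value/policy iterations;
        # the clip keeps the gain in a stabilizing range during transients (assumption)
        K = float(np.clip(Hxu / Huu, 0.0, 1.5))
    x = x_next
```

For these weights the analytic LQR gain is roughly 0.462 (from the scalar Riccati equation), and the learned `K` settles near it even though the learner only ever sees `(x, u, r, x_next)` tuples, which is the sense in which the optimal regulator is obtained without knowing the system dynamics.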
