Stochastic optimal controller design for medium access constrained networked control systems with unknown dynamics

This paper proposes a stochastic optimal controller for networked control systems (NCS) with unknown dynamics and medium access constraints. The medium access constraint of the NCS is modelled as a Markov Decision Process (MDP) that switches modes depending on which actuators are granted channel access. We then show that, under the MDP assumption, the NCS with medium access constraints can be modelled as a Markovian jump linear system. A stochastic optimal controller is then proposed that minimizes a quadratic cost function using a Q-learning algorithm. The resulting control algorithm simultaneously optimizes the quadratic cost function and allocates the network bandwidth judiciously by designing a scheduler. Two compensation strategies, transmit-zero and zero-order hold, are studied for control inputs that fail to gain access to the channel. The proposed controller and scheduler are illustrated using experiments on networks and simulations on an industrial four-tank system. The advantage of the proposed approach is that the optimal controller and scheduler can be designed forward-in-time for NCS with unknown dynamics. This is a departure from traditional dynamic-programming-based approaches, which assume complete knowledge of the NCS dynamics and network constraints beforehand and solve the optimal control problem backward-in-time.
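The two compensation strategies named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `apply_compensation`, its parameters, and the example values are all hypothetical, and it assumes that each actuator either receives its freshly computed input (channel access granted) or falls back to zero or to its previously applied input (access denied).

```python
from typing import List, Sequence


def apply_compensation(
    u_new: Sequence[float],
    u_prev: Sequence[float],
    access: Sequence[bool],
    strategy: str = "hold",
) -> List[float]:
    """Return the inputs actually applied to the plant at one time step.

    Hypothetical helper for illustration only.
    u_new   : inputs computed by the controller at this step
    u_prev  : inputs applied to the plant at the previous step
    access  : True where the actuator was granted channel access
    strategy: "zero" (transmit-zero) or "hold" (zero-order hold)
    """
    if strategy not in ("zero", "hold"):
        raise ValueError("strategy must be 'zero' or 'hold'")
    if not (len(u_new) == len(u_prev) == len(access)):
        raise ValueError("all inputs must have the same length")

    applied = []
    for new, prev, has_access in zip(u_new, u_prev, access):
        if has_access:
            applied.append(new)    # fresh input reaches the actuator
        elif strategy == "zero":
            applied.append(0.0)    # transmit-zero: actuator input set to zero
        else:
            applied.append(prev)   # zero-order hold: keep the last applied input
    return applied


# Assumed example: two actuators, only the first is granted access.
u_new = [1.5, -0.7]
u_prev = [0.2, 0.4]
access = [True, False]

applied_zero = apply_compensation(u_new, u_prev, access, strategy="zero")
applied_hold = apply_compensation(u_new, u_prev, access, strategy="hold")
```

In this sketch the access pattern plays the role of the mode: each distinct pattern yields a different effective input map, which is consistent with the abstract's view of the constrained NCS as a system that switches modes with channel access.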
