A Stable Distributed Neural Controller for Physically Coupled Networked Discrete-Time Systems via Online Reinforcement Learning

The large scale, time-varying dynamics, and heterogeneity of physically coupled networked infrastructures, such as power grids and transportation systems, complicate the design, implementation, and expansion of their controllers. To tackle these challenges, we propose an online distributed reinforcement learning control algorithm in which each subsystem (agent) is equipped with a one-layer neural network, allowing it to adapt to variations in the networked infrastructure. Each controller consists of a critic network and an action network, which approximate the strategy utility function and the desired control law, respectively. To avoid a large number of trials and to improve stability, the training of the action network introduces a supervised learning mechanism into the reduction of the long-term cost. The stability of the closed-loop system under the learning algorithm is analyzed, and upper bounds on the tracking error and the neural network weights are estimated. Simulation results illustrate the effectiveness of the proposed controller and indicate that stability is maintained under communication delay and disturbances as well.
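The per-agent structure described above can be illustrated with a minimal single-agent sketch, not the paper's exact equations: a one-layer critic trained by temporal-difference learning on the long-term cost, and a one-layer action network whose update blends a supervised error toward a known desired control law with a crude critic-driven correction. The class name `AgentController`, the `tanh` feature map, the blend weight `beta`, and the supervising law `u_sup` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class AgentController:
    """Toy one-agent sketch of a critic/action pair, each a single-layer NN."""

    def __init__(self, n_state, n_action, lr_c=0.05, lr_a=0.05,
                 gamma=0.95, beta=0.5):
        self.Wc = rng.normal(scale=0.1, size=n_state)              # critic weights
        self.Wa = rng.normal(scale=0.1, size=(n_action, n_state))  # action weights
        self.lr_c, self.lr_a = lr_c, lr_a
        self.gamma = gamma   # discount factor of the long-term cost
        self.beta = beta     # weight of the supervised term in the action update

    def critic(self, x):
        # Single-layer critic: approximate long-term cost (strategy utility) J(x).
        return self.Wc @ np.tanh(x)

    def act(self, x):
        # Single-layer action network: approximate control law u(x).
        return self.Wa @ np.tanh(x)

    def update(self, x, x_next, stage_cost, u_sup):
        phi = np.tanh(x)
        # Temporal-difference error of the long-term cost estimate.
        td = stage_cost + self.gamma * self.critic(x_next) - self.critic(x)
        self.Wc += self.lr_c * td * phi  # critic update along the TD error
        # Action update: supervised pull toward the desired law u_sup, blended
        # with a simplified critic-driven correction (the paper derives the
        # exact form; this term is only a placeholder surrogate).
        u = self.act(x)
        e_sup = u_sup - u
        self.Wa += self.lr_a * np.outer(
            self.beta * e_sup - (1 - self.beta) * td * u, phi)
        return td
```

Online operation would then interleave `act` and `update` at every sampling step of the discrete-time subsystem, with `u_sup` supplied by whatever stabilizing nominal law supervises the action network.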
