Adaptive Dynamic Programming for Stochastic Systems With State and Control Dependent Noise

In this technical note, the adaptive optimal control problem is investigated for a class of continuous-time stochastic systems subject to multiplicative noise. A novel non-model-based optimal control design methodology is employed to iteratively update the control policy online, directly using measured data of the system state and input. Both adaptive dynamic programming (ADP) and robust ADP algorithms are developed, along with rigorous stability and convergence analysis. The effectiveness of the proposed methods is illustrated by an example arising from biological sensorimotor control.
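To make the idea of iteratively updating a control policy concrete, the following is a minimal sketch of a model-based policy iteration for a linear system with state- and control-dependent noise. It is not the note's algorithm: the note's methods are non-model-based and work from state and input data, whereas this sketch assumes the model is known. The system form dx = (Ax + Bu) dt + A1 x dw1 + B1 u dw2 with independent scalar Wiener processes, the quadratic cost with weights Q and R, and all function and variable names are assumptions introduced for illustration only.

```python
import numpy as np

def evaluate_policy(A, B, A1, B1, Q, R, K):
    """Solve the generalized Lyapunov equation for the cost matrix P of u = -K x.

    Assumed equation (model-based):
      Ac'P + P Ac + A1'P A1 + (B1 K)'P (B1 K) + Q + K'R K = 0,  Ac = A - B K.
    It is linear in P, so it is vectorized (row-major) and solved directly.
    """
    n = A.shape[0]
    I = np.eye(n)
    Ac = A - B @ K
    M = B1 @ K
    # Row-major identity: vec(X P Y) = kron(X, Y.T) @ vec(P)
    L = (np.kron(Ac.T, I) + np.kron(I, Ac.T)
         + np.kron(A1.T, A1.T) + np.kron(M.T, M.T))
    rhs = -(Q + K.T @ R @ K).reshape(-1)
    P = np.linalg.solve(L, rhs).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def improve_policy(B, B1, R, P):
    """Policy improvement: K = (R + B1'P B1)^{-1} B'P."""
    return np.linalg.solve(R + B1.T @ P @ B1, B.T @ P)

def policy_iteration(A, B, A1, B1, Q, R, K0, iters=30):
    """Alternate evaluation and improvement, starting from a gain K0 that is
    assumed to be mean-square stabilizing."""
    K = K0
    for _ in range(iters):
        P = evaluate_policy(A, B, A1, B1, Q, R, K)
        K = improve_policy(B, B1, R, P)
    return P, K

def gare_residual(A, B, A1, B1, Q, R, P):
    """Residual of the generalized algebraic Riccati equation at P."""
    G = R + B1.T @ P @ B1
    return (A.T @ P + P @ A + A1.T @ P @ A1 + Q
            - P @ B @ np.linalg.solve(G, B.T @ P))

# Hypothetical example data; K0 = 0 is admissible here because the
# open-loop system is already mean-square stable.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
A1 = 0.1 * np.eye(2)   # state-dependent noise intensity
B1 = 0.2 * B           # control-dependent noise intensity
Q = np.eye(2)
R = np.array([[1.0]])
P, K = policy_iteration(A, B, A1, B1, Q, R, np.zeros((1, 2)))
```

A data-driven variant would replace the model-dependent evaluation step with an estimate built from measured trajectories; that step is not reproduced here because the abstract does not specify it.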
