Adaptive optimal control for linear stochastic systems with additive noise

In this paper, an optimal control design scheme is proposed for continuous-time linear stochastic systems with unknown dynamics. Both signal-dependent noise and additive noise are considered. A non-model-based optimal control design methodology is employed to iteratively update the control policy online using measured system state and input information. A new adaptive dynamic programming algorithm is developed, and a convergence result for the proposed method is presented. The effectiveness of the method is illustrated by a practical simulation example of a two-degree-of-freedom (2-DOF) vehicle suspension control system.
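The iterative policy-update idea underlying such adaptive dynamic programming schemes can be illustrated by its model-based idealization, Kleinman's policy iteration for continuous-time LQR: alternate between evaluating the current gain via a Lyapunov equation and improving it from the resulting cost matrix. The sketch below is illustrative only; the system matrices, weights, and initial gain are assumed values, not data from the paper, and the paper's actual algorithm estimates these quantities online from state and input data without knowing the dynamics.

```python
# Minimal sketch of Kleinman's policy iteration for continuous-time LQR,
# the model-based counterpart of the data-driven ADP update.
# All matrices below are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -2.0]])  # open-loop stable, so K0 = 0 is stabilizing
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                             # state cost weight
R = np.array([[1.0]])                     # input cost weight

K = np.zeros((1, 2))                      # initial stabilizing gain
for _ in range(20):
    Ak = A - B @ K
    # Policy evaluation: solve  Ak^T P + P Ak + Q + K^T R K = 0
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    # Policy improvement: K <- R^{-1} B^T P
    K = np.linalg.solve(R, B.T @ P)

# The iterates converge to the stabilizing solution of the algebraic
# Riccati equation, which can be checked against a direct ARE solver.
P_are = solve_continuous_are(A, B, Q, R)
print(np.allclose(P, P_are, atol=1e-8))
```

In the data-driven setting treated by the paper, the Lyapunov-equation step is replaced by a least-squares fit over trajectories of the (noisy) system, so that neither A nor B needs to be known.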
