A model-free robust policy iteration algorithm for optimal control of nonlinear systems
[1] Paul J. Werbos. A menu of designs for reinforcement learning over time , 1990 .
[2] Miroslav Krstic,et al. Nonlinear and adaptive control design , 1995 .
[3] Alexander S. Poznyak,et al. Differential Neural Networks for Robust Nonlinear Control: Identification, State Estimation and Trajectory Tracking , 2001 .
[4] Weiping Li,et al. Applied Nonlinear Control , 1991 .
[5] George G. Lendaris,et al. Adaptive dynamic programming , 2002, IEEE Trans. Syst. Man Cybern. Part C.
[6] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[7] Warren E. Dixon,et al. Nonlinear Control of Engineering Systems , 2002 .
[8] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[9] F. L. Lewis. Nonlinear Network Structures for Feedback Control , 1999 .
[10] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[11] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[12] Frank L. Lewis,et al. Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[13] S. N. Balakrishnan,et al. Adaptive-critic based neural networks for aircraft optimal control , 1996 .
[14] George G. Lendaris,et al. Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[15] S. N. Balakrishnan,et al. State-constrained agile missile control with adaptive-critic-based neural networks , 2002, IEEE Trans. Control. Syst. Technol..
[16] Lyle Noakes,et al. Continuous-Time Adaptive Critics , 2007, IEEE Transactions on Neural Networks.
[17] Bernard Widrow,et al. Punish/Reward: Learning with a Critic in Adaptive Threshold Systems , 1973, IEEE Trans. Syst. Man Cybern..
[18] Donald E. Kirk,et al. Optimal control theory : an introduction , 1970 .
[19] Paul J. Werbos,et al. Approximate dynamic programming for real-time control and neural modeling , 1992 .
[20] Frank L. Lewis,et al. 2009 Special Issue: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems , 2009 .
[21] Frank L. Lewis,et al. Online Synchronous Policy Iteration Method for Optimal Control , 2009 .
[22] G. Lewicki,et al. Approximation by Superpositions of a Sigmoidal Function , 2003 .
[23] Robert F. Stengel,et al. An adaptive critic global controller , 2002, Proceedings of the 2002 American Control Conference (IEEE Cat. No.CH37301).
[24] Jennie Si,et al. Handbook of Learning and Approximate Dynamic Programming (IEEE Press Series on Computational Intelligence) , 2004 .
[25] Frank L. Lewis,et al. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach , 2005, Autom..
[26] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[27] Richard S. Sutton,et al. Reinforcement Learning is Direct Adaptive Optimal Control , 1992, 1991 American Control Conference.
[28] Randal W. Beard,et al. Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation , 1997, Autom..
[29] Warren B. Powell,et al. Handbook of Learning and Approximate Dynamic Programming , 2006, IEEE Transactions on Automatic Control.
[30] J J Hopfield,et al. Neurons with graded response have collective computational properties like those of two-state neurons. , 1984, Proceedings of the National Academy of Sciences of the United States of America.