Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem