Finite horizon discrete-time approximate dynamic programming
[1] A. A. Mullin,et al. Principles of neurodynamics , 1962 .
[2] W. Wonham. Random differential equations in control theory , 1970 .
[3] H. Kang,et al. Optimal control of nonlinear stochastic systems , 1971 .
[4] Bernard Widrow,et al. Punish/Reward: Learning with a Critic in Adaptive Threshold Systems , 1973, IEEE Trans. Syst. Man Cybern..
[5] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[6] Averill M. Law,et al. The art and theory of dynamic programming , 1977 .
[7] G. Gopalakrishnan Nair,et al. Suboptimal control of nonlinear systems , 1978, Autom..
[8] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[9] Paul J. Werbos,et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[10] Kumpati S. Narendra,et al. Identification and control of dynamical systems using neural networks , 1990, IEEE Trans. Neural Networks.
[11] Bernard Widrow,et al. 30 years of adaptive neural networks: perceptron, Madaline, and backpropagation , 1990, Proc. IEEE.
[12] Paul J. Werbos,et al. Consistency of HDP applied to a simple reinforcement learning problem , 1990, Neural Networks.
[13] Kumpati S. Narendra,et al. Control of nonlinear dynamical systems using neural networks: controllability and stabilization , 1993, IEEE Trans. Neural Networks.
[14] Richard S. Sutton,et al. A Menu of Designs for Reinforcement Learning Over Time , 1995 .
[15] S. N. Balakrishnan,et al. Adaptive-critic based neural networks for aircraft optimal control , 1996 .
[16] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[17] S. N. Balakrishnan,et al. A neighboring optimal adaptive critic for missile guidance , 1996 .
[18] R. Saeks,et al. On the design of a neural network autolander , 1999 .
[19] Paul J. Werbos,et al. Stable adaptive control using new critic designs , 1998, Other Conferences.
[20] Jennie Si,et al. Online learning control by association and reinforcement , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[21] George G. Lendaris,et al. A radial basis function implementation of the adaptive dynamic programming algorithm , 2002, The 2002 45th Midwest Symposium on Circuits and Systems, 2002. MWSCAS-2002..
[22] George G. Lendaris,et al. Adaptive dynamic programming , 2002, IEEE Trans. Syst. Man Cybern. Part C.
[23] Derong Liu,et al. Call admission control for CDMA cellular networks using adaptive critic designs , 2003, Proceedings of the 2003 IEEE International Symposium on Intelligent Control.
[24] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[25] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[26] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.