Simple and Fast Calculation of the Second-Order Gradients for Globalized Dual Heuristic Dynamic Programming in Neural Networks
[1] Louis B. Rall,et al. Automatic Differentiation: Techniques and Applications , 1981, Lecture Notes in Computer Science.
[2] Michael Fairbank,et al. The Local Optimality of Reinforcement Learning by Value Gradients, and its Relationship to Policy Gradient Learning , 2011, ArXiv.
[3] George G. Lendaris,et al. Training strategies for critic and action neural networks in dual heuristic programming method , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).
[4] Alexander Linden,et al. Inversion of neural networks by gradient descent , 1990, Parallel Comput.
[5] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput.
[6] Zhong-Ping Jiang,et al. Approximate Dynamic Programming for Optimal Stationary Control With Control-Dependent Noise , 2011, IEEE Transactions on Neural Networks.
[7] Haibo He,et al. Adaptive Learning and Control for MIMO System Based on Adaptive Dynamic Programming , 2011, IEEE Transactions on Neural Networks.
[8] R. Bellman. Dynamic programming , 1957, Science.
[9] D. Liu,et al. Adaptive Dynamic Programming for Finite-Horizon Optimal Control of Discrete-Time Nonlinear Systems With $\varepsilon$-Error Bound , 2011, IEEE Transactions on Neural Networks.
[10] Huaguang Zhang,et al. Adaptive Dynamic Programming: An Introduction , 2009, IEEE Computational Intelligence Magazine.
[11] Rémi Coulom,et al. Reinforcement Learning Using Neural Networks, with Applications to Motor Control. (Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur) , 2002 .
[12] M. F. Møller,et al. Exact Calculation of the Product of the Hessian Matrix of Feed-Forward Network Error Functions and a Vector in O(N) Time , 1993 .
[13] Paul J. Werbos,et al. Approximate dynamic programming for real-time control and neural modeling , 1992 .
[14] Michael Fairbank,et al. Value-gradient learning , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[15] S. Ferrari, R. F. Stengel. Model-Based Adaptive Critic Designs , 2004 .
[16] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian , 1994, Neural Computation.
[17] Paul J. Werbos,et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[18] Khan M. Iftekharuddin,et al. Transformation Invariant On-Line Target Recognition , 2011, IEEE Transactions on Neural Networks.
[19] Michael Fairbank,et al. Reinforcement Learning by Value Gradients , 2008, ArXiv.
[20] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[21] P. Werbos. Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities , 2006 .
[22] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.