A DVANTAGE U PDATING
暂无分享,去创建一个
[1] L. Baird,et al. A MATHEMATICAL ANALYSIS OF ACTOR-CRITIC ARCHITECTURES FOR LEARNING OPTIMAL CONTROLS THROUGH INCREMENTAL DYNAMIC PROGRAMMING (cid:3) , 1990 .
[2] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .
[3] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[4] James S. Morgan,et al. A Hierarchical Network of Control Systems that Learn: Modeling Nervous System Function During Classical and Instrumental Conditioning , 1993, Adapt. Behav..
[5] A. Harry Klopf,et al. A Hierarchical Network of Provably Optimal Learning Control Systems: Extensions of the Associative Control Process (ACP) Network , 1993, Adapt. Behav..
[6] Donald A. Sofge,et al. Neural network based process optimization and control , 1990, 29th IEEE Conference on Decision and Control.
[7] C. Watkins. Learning from delayed rewards , 1989 .
[8] Vijaykumar Gullapalli,et al. A stochastic reinforcement learning algorithm for learning real-valued functions , 1990, Neural Networks.
[9] L. C. Baird. Function minimization for dynamic programming using connectionist networks , 1992, [Proceedings] 1992 IEEE International Conference on Systems, Man, and Cybernetics.
[10] G. Tesauro. Practical Issues in Temporal Difference Learning , 1992 .
[11] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[12] Steven J. Bradtke,et al. Reinforcement Learning Applied to Linear Quadratic Regulation , 1992, NIPS.
[13] Ronald J. Williams,et al. Analysis of Some Incremental Variants of Policy Iteration: First Steps Toward Understanding Actor-Cr , 1993 .
[14] A. Harry Klopf,et al. Advantage Updating Applied to a Differrential Game , 1994, NIPS.
[15] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[16] B. Widrow,et al. Neural networks for self-learning control systems , 1990, IEEE Control Systems Magazine.
[17] Donald A. Sofge,et al. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches , 1992 .
[18] M. F.,et al. Bibliography , 1985, Experimental Gerontology.
[19] Leemon C Baird,et al. Reinforcement Learning With High-Dimensional, Continuous Actions , 1993 .
[20] David Q. Mayne,et al. Differential dynamic programming , 1972, The Mathematical Gazette.