Technical Update: Least-Squares Temporal Difference Learning
[1] William H. Press,et al. Numerical Recipes in C: The Art of Scientific Computing , 1995 .
[2] William H. Press,et al. Book-Review - Numerical Recipes in Pascal - the Art of Scientific Computing , 1989 .
[3] F. A. Seiler,et al. Numerical Recipes in C: The Art of Scientific Computing , 1989 .
[4] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[5] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[6] William H. Press,et al. Numerical recipes in C (2nd ed.): the art of scientific computing , 1992 .
[7] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[8] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[9] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[10] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[11] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[12] Christopher G. Atkeson,et al. A comparison of direct and model-based reinforcement learning , 1997, Proceedings of International Conference on Robotics and Automation.
[13] Christopher G. Atkeson,et al. Nonparametric Model-Based Reinforcement Learning , 1997, NIPS.
[14] Andrew W. Moore,et al. Learning evaluation functions for global optimization , 1998 .
[15] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[16] Andrew W. Moore,et al. Learning Evaluation Functions for Global Optimization and Boolean Satisfiability , 1998, AAAI/IAAI.
[17] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[18] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[19] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[20] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[21] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[22] R. Sutton. Gain Adaptation Beats Least Squares , 2006 .