TD-Gammon: A Self-Teaching Backgammon Program
[1] A. L. Samuel et al. Some studies in machine learning using the game of checkers. II: recent progress, 1967.
[2] Arnold K. Griffith et al. A Comparison and Evaluation of Three Machine Learning Procedures as Applied to the Game of Checkers, 1974, Artif. Intell.
[3] Norman Zadeh et al. On Optimal Doubling in Backgammon, 1977.
[4] J. Ross Quinlan et al. Learning Efficient Classification Procedures and Their Application to Chess End Games, 1983.
[5] Richard S. Sutton et al. Temporal credit assignment in reinforcement learning, 1984.
[6] P. W. Frey et al. Algorithmic strategies for improving the performance of game-playing programs, 1986.
[7] Geoffrey E. Hinton et al. Learning internal representations by error propagation, 1986.
[8] Richard E. Korf et al. A Unified Theory of Heuristic Evaluation Functions and its Application to Learning, 1986, AAAI.
[9] James L. McClelland et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations, 1986.
[10] Dimitri P. Bertsekas et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[11] Gerald Tesauro et al. Connectionist Learning of Expert Preferences by Comparison Training, 1988, NIPS.
[12] Sanjoy Mahajan et al. A Pattern Classification Approach to Evaluation Function Learning, 1988, Artif. Intell.
[13] Terrence J. Sejnowski et al. A Parallel Network that Learns to Play Backgammon, 1989, Artif. Intell.
[14] Kurt Hornik et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[15] Gerald Tesauro et al. Neurogammon: a neural-network backgammon program, 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[16] Gerald Tesauro et al. Practical Issues in Temporal Difference Learning, 1992, Mach. Learn.