Practical Issues in Temporal Difference Learning
[1] P. W. Frey, et al. Algorithmic strategies for improving the performance of game-playing programs, 1986.
[2] Vladimir Vapnik, et al. On the uniform convergence of relative frequencies of events to their probabilities, 1971.
[3] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[4] Sanjoy Mahajan, et al. A Pattern Classification Approach to Evaluation Function Learning, 1988, Artif. Intell.
[5] Arnold K. Griffith, et al. A Comparison and Evaluation of Three Machine Learning Procedures as Applied to the Game of Checkers, 1974, Artif. Intell.
[6] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[7] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[8] Norman Zadeh, et al. On Optimal Doubling in Backgammon, 1977.
[9] Hans J. Berliner, et al. On the Construction of Evaluation Functions for Large Domains, 1979, IJCAI.
[10] A. L. Samuel, et al. Some studies in machine learning using the game of checkers. II: recent progress, 1967.
[11] Paul E. Utgoff, et al. Two Kinds of Training Information For Evaluation Function Learning, 1991, AAAI.
[12] Charles W. Anderson, et al. Strategy Learning with Multilayer Connectionist Representations, 1987.
[13] Hans J. Berliner, et al. Experiences in Evaluation with BKG - A Program that Plays Backgammon, 1977, IJCAI.
[14] Arthur L. Samuel, et al. Some studies in machine learning using the game of checkers, in Computers and Thought, 1995.
[15] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.
[16] Richard E. Korf, et al. A Unified Theory of Heuristic Evaluation Functions and its Application to Learning, 1986, AAAI.
[17] J. Ross Quinlan, et al. Learning Efficient Classification Procedures and Their Application to Chess End Games, 1983.
[18] B. Widrow, et al. Stationary and nonstationary learning characteristics of the LMS adaptive filter, 1976, Proceedings of the IEEE.
[19] Gerald Tesauro, et al. Connectionist Learning of Expert Preferences by Comparison Training, 1988, NIPS.
[20] Gerald Tesauro, et al. Neurogammon: a neural-network backgammon program, 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[21] Terrence J. Sejnowski, et al. A Parallel Network that Learns to Play Backgammon, 1989, Artif. Intell.
[22] Gerald Tesauro, et al. Practical Issues in Temporal Difference Learning, 1992, Mach. Learn.
[23] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.