Learning to Play Board Games using Temporal Difference Methods
Marco Wiering | Jan Peter Patist | Henk Mannen