Using Reinforcement Learning in Chess Engines

Until recently, the use of reinforcement learning (RL) in chess programming was problematic and failed to yield the expected results. A breakthrough came with Gerald Tesauro's work on backgammon, which produced a program that played at the level of the best human players. Our chess engine demonstrates that reinforcement learning combined with the classification of board states leads to a notable improvement over engines that use reinforcement learning alone, such as KnightCap. We extended KnightCap's learning algorithm by using a larger and more complete database of board states, and by adjusting and optimizing the evaluation coefficients for each position class individually. A clear improvement in the engine's learning and playing strength is reached after only a few training games.
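To illustrate the idea of per-class coefficient learning, the following is a minimal sketch of a TDLeaf(λ)-style update in the spirit of KnightCap, where each position class keeps its own evaluation weight vector. All names, the number of classes and features, and the specific update rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

NUM_CLASSES = 3    # assumed classes, e.g. opening / middlegame / endgame
NUM_FEATURES = 4   # assumed features, e.g. material, mobility, king safety, pawns

# One coefficient vector per position class (the key extension over a
# single shared vector).
weights = np.zeros((NUM_CLASSES, NUM_FEATURES))

def evaluate(features: np.ndarray, cls: int) -> float:
    """Linear evaluation squashed to (-1, 1), using class-specific weights."""
    return float(np.tanh(weights[cls] @ features))

def td_lambda_update(positions, classes, outcome, alpha=0.01, lam=0.7):
    """TD(lambda)-style update: each position's class receives its own
    coefficient adjustment, driven by the lambda-weighted sum of future
    temporal differences along the game."""
    evals = [evaluate(f, c) for f, c in zip(positions, classes)]
    evals.append(outcome)  # terminal value: game result in [-1, 1]
    n = len(positions)
    for t in range(n):
        delta = sum((lam ** (j - t)) * (evals[j + 1] - evals[j])
                    for j in range(t, n))
        grad = (1.0 - evals[t] ** 2) * positions[t]  # d tanh(w.f) / d w
        weights[classes[t]] += alpha * delta * grad
```

In this sketch, only the weight vector of the class a position belongs to is updated for that position, so opening-specific and endgame-specific coefficients can drift apart during training instead of being averaged into one global vector.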