Final Adaptation Reinforcement Learning for N-Player Games

This paper covers n-tuple-based reinforcement learning (RL) algorithms for games. We present new algorithms for TD-, SARSA- and Q-learning which work seamlessly on various games with an arbitrary number of players. This is achieved by taking a player-centered view in which each player propagates his/her rewards back to previous rounds. We add a new element called Final Adaptation RL (FARL) to all these algorithms. Our main contribution is to show that FARL is a vitally important ingredient for achieving success with the player-centered view in various games. We report results on seven board games with 1, 2 and 3 players, including Othello, ConnectFour and Hex. In most cases we find that FARL is important for learning a near-perfect playing strategy. All algorithms are available in the GBG framework on GitHub.
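
The player-centered view and the FARL step can be made concrete with a short sketch. The Python fragment below is a minimal illustration under our own assumptions (a tabular value function V kept in a dictionary, hypothetical helpers td_update and final_adaptation); the actual GBG agents use n-tuple networks and are implemented in Java, so none of these names reflect the GBG API.

```python
# Minimal sketch, NOT the authors' GBG implementation. All names
# (td_update, final_adaptation, V, alpha, gamma) are illustrative assumptions.

def td_update(V, s_prev, s_next, reward, alpha=0.01, gamma=1.0):
    """Player-centered TD(0) step: the acting player bootstraps from the
    state it observes on its own next turn, so the same update rule works
    for 1, 2 or 3 players without game-specific sign flipping."""
    target = reward + gamma * V.get(s_next, 0.0)
    V[s_prev] = V.get(s_prev, 0.0) + alpha * (target - V.get(s_prev, 0.0))

def final_adaptation(V, last_state_of, final_reward_of, alpha=0.01):
    """FARL-style terminal step (sketch): when the episode ends, each player
    adapts the value of the last state seen from its own perspective toward
    its final reward, since no further own-turn state exists to bootstrap
    from."""
    for player, s_last in last_state_of.items():
        delta = final_reward_of[player] - V.get(s_last, 0.0)
        V[s_last] = V.get(s_last, 0.0) + alpha * delta
```

A driver loop would call td_update after each of a player's own moves and final_adaptation exactly once when the game terminates; without that final step, the terminal rewards of all players except the one to move last would never reach the value function.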
