Paper information - Neural Approximation of Monte Carlo Policy Evaluation Deployed in Connect Four

To win a board game or, more generally, to reach a specific goal in a given Markov environment, it is essential to have a policy for choosing and taking actions that leads to one of several qualitatively good states. In this paper we describe a novel method for learning a game-winning strategy. The method predicts the statistical probability of winning from a given game state using a state-value function that is approximated by a multi-layer perceptron. These predictions improve according to the rewards given in terminal states. We deployed the method in the game Connect Four and compared its playing performance with Velena [5].
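The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. As an assumption, a toy random walk stands in for Connect Four: states 1 to 5 are non-terminal, state 0 is a loss (reward 0) and state 6 is a win (reward 1). The class `TinyMLP`, the one-hot encoding, the hidden-layer size, and the learning rate are all hypothetical choices made for illustration. Each visited state is trained toward the reward observed at the end of its episode, so the network output approximates the probability of winning from that state.

```python
import math
import random

N = 5  # assumed toy environment: non-terminal states 1..N, 0 = loss, N + 1 = win


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Hypothetical one-hidden-layer perceptron with a sigmoid output in (0, 1)."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def update(self, x, target, lr):
        """One gradient step on the squared error 0.5 * (y - target)**2."""
        h, y = self.forward(x)
        d_out = (y - target) * y * (1.0 - y)
        for j in range(len(h)):
            # Hidden-layer gradient uses w2[j] before it is updated.
            d_h = d_out * self.w2[j] * (1.0 - h[j] ** 2)
            self.w2[j] -= lr * d_out * h[j]
            for i in range(len(x)):
                self.w1[j][i] -= lr * d_h * x[i]
            self.b1[j] -= lr * d_h
        self.b2 -= lr * d_out


def encode(state):
    """One-hot encoding of a non-terminal state."""
    return [1.0 if i == state - 1 else 0.0 for i in range(N)]


def play_episode(rng):
    """Follow a fixed random policy until a terminal state is reached."""
    state = (N + 1) // 2
    visited = []
    while 0 < state < N + 1:
        visited.append(state)
        state += rng.choice((-1, 1))
    reward = 1.0 if state == N + 1 else 0.0
    return visited, reward


def train(episodes=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    net = TinyMLP(N, 8, rng)
    for _ in range(episodes):
        visited, reward = play_episode(rng)
        # Monte Carlo target: the terminal reward of the episode,
        # applied to every state visited along the way.
        for state in visited:
            net.update(encode(state), reward, lr)
    return net


net = train()
values = [net.predict(encode(s)) for s in range(1, N + 1)]
print(values)
```

In this toy chain the true win probability from state s is s/6, so the printed estimates can be compared against that reference. For a real game the one-hot encoding would be replaced by a board encoding, and the random policy by the policy being evaluated.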

Friedhelm Schwenker | Stefan Faußer

[1] Emile Fiesler, et al. Optimal Setting of Weights, Learning Rate, and Gain, 1997.

[2] L. Victor Allis, et al. A Knowledge-Based Approach of Connect-Four, 1988, J. Int. Comput. Games Assoc.

[3] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[4] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.

[5] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.

[6] G. Lewicki, et al. Approximation by Superpositions of a Sigmoidal Function, 2003.

[7] Emile Fiesler, et al. High-order and multilayer perceptron initialization, 1997, IEEE Trans. Neural Networks.

[8] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.