What a Neural Network Can Learn About Othello

Conventional Othello programs are based on a thorough analysis of the game, and typically employ sophisticated evaluation functions and supervised learning techniques that use large expert-labeled game databases. This paper presents an alternative method that trains a neural network to evaluate Othello positions via temporal difference (TD) learning. The approach is based on a network architecture that reflects the spatial and temporal organization of the problem domain. The network begins with random weights, and through self-play achieves an intermediate level of play. We also present a simple and effective method for analyzing what the network learned.

P. E. Utgoff | A. V. Leouski

[1] Kai-Fu Lee, et al. The Development of a World Class Othello Program, 1990, Artif. Intell.

[2] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[3] Jonathan Baxter, et al. Learning internal representations, 1995, COLT '95.

[4] Risto Miikkulainen, et al. Evolving Complex Othello Strategies Using Marker-Based Genetic Encoding of Neural Networks, 1993.

[5] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.

[6] Michael I. Jordan, et al. Task Decomposition through Competition in A, 1990.

[7] Hans J. Berliner, et al. On the Construction of Evaluation Functions for Large Domains, 1979, IJCAI.

[8] Michael I. Jordan, et al. Task Decomposition Through Competition in a Modular Connectionist Architecture: The What and Where Vision Tasks, 1990, Cogn. Sci.

[9] Sanjoy Mahajan, et al. A Pattern Classification Approach to Evaluation Function Learning, 1988, Artif. Intell.

[10] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.

[11] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[12] Paul S. Rosenbloom, et al. A World-Championship-Level Othello Program, 1982, Artif. Intell.

[13] Gerald Tesauro. Practical Issues in Temporal Difference Learning, 1992.