Paper Information - Learning to Play Othello with N-Tuple Systems

Learning to Play Othello with N-Tuple Systems

This paper investigates the use of n-tuple systems as position value functions for the game of Othello. The architecture is described, and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system is able to defeat the best performing of these after just five hundred games of self-play learning. The conclusion is that n-tuple networks learn faster and better than the other more conventional approaches.

Simon M. Lucas
