Learning to Play Othello with N-Tuple Systems

This paper investigates n-tuple systems as position value functions for the game of Othello. The architecture is described and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system defeats the best-performing of these after just five hundred games of self-play learning. The conclusion is that n-tuple networks learn faster and better than the other, more conventional approaches.
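To make the idea of an n-tuple position value function concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes each n-tuple samples a fixed set of board squares, each square takes one of three states, the sampled states form a base-3 index into a per-tuple weight table, and the table entries are summed and squashed with tanh. The class name `NTupleValue`, the square encoding, the learning rate `alpha`, and the simple gradient step toward a target value are all illustrative choices; the abstract does not specify them.

```python
import math

# Hypothetical square encoding (an assumption, not taken from the paper).
EMPTY, BLACK, WHITE = 0, 1, 2


class NTupleValue:
    """Sketch of an n-tuple value function: one lookup table per tuple."""

    def __init__(self, tuples):
        # tuples: list of sequences of board-square indices (0..63).
        self.tuples = [tuple(t) for t in tuples]
        # Each table has 3**n entries, one per joint state of the n squares.
        self.tables = [[0.0] * (3 ** len(t)) for t in self.tuples]

    def index(self, board, squares):
        # Read the sampled squares as the digits of a base-3 number.
        idx = 0
        for sq in squares:
            idx = idx * 3 + board[sq]
        return idx

    def value(self, board):
        # Sum the addressed table entries, then squash into (-1, 1).
        total = sum(
            table[self.index(board, squares)]
            for squares, table in zip(self.tuples, self.tables)
        )
        return math.tanh(total)

    def update(self, board, target, alpha=0.01):
        # One gradient step moving value(board) toward target.
        # In temporal difference learning the target would typically be
        # the value of the next position or the final game outcome.
        v = self.value(board)
        delta = alpha * (target - v) * (1.0 - v * v)
        for squares, table in zip(self.tuples, self.tables):
            table[self.index(board, squares)] += delta


# Usage: two arbitrary 3-tuples on an 8x8 board stored as a flat list.
board = [EMPTY] * 64
board[0], board[2] = BLACK, WHITE
net = NTupleValue([(0, 1, 2), (7, 15, 23)])
before = net.value(board)
net.update(board, target=1.0)
after = net.value(board)
```

Only the table entries addressed by the current position change on each update, which makes each learning step cheap compared with updating every weight of a multi-layer perceptron.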
