A scalable neural network architecture for board games

This paper proposes multi-dimensional recurrent neural networks (MDRNNs) as a way to overcome one of the key problems in flexible-size board games: scalability. We show why this architecture is well suited to the domain and how it can be trained successfully to play such games, even without any domain-specific knowledge. We find that performance on small boards correlates well with performance on large ones, and that this property holds for networks trained by either evolution or coevolution.
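
To make the scalability argument concrete, the following is a minimal sketch of the general idea behind a two-dimensional recurrent sweep. It is not the paper's implementation. It assumes a single scalar hidden unit per cell, a tanh activation, one weight set shared by all four sweep directions, and a board encoded as -1, 0 or 1 per cell. None of these specifics come from the abstract, and the names `sweep` and `mdrnn_scores` are hypothetical.

```python
import math


def sweep(board, w_in, w_row, w_col, bias):
    """One directional sweep starting at the top-left corner.

    Each cell's hidden value depends on its own input and on the hidden
    values of the cell above and the cell to the left (assumed form).
    """
    n_rows, n_cols = len(board), len(board[0])
    h = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            up = h[i - 1][j] if i > 0 else 0.0
            left = h[i][j - 1] if j > 0 else 0.0
            h[i][j] = math.tanh(
                w_in * board[i][j] + w_row * up + w_col * left + bias
            )
    return h


def flip_rows(grid):
    return grid[::-1]


def flip_cols(grid):
    return [row[::-1] for row in grid]


def mdrnn_scores(board, params):
    """Sum the hidden values of four sweeps, one from each corner.

    The parameter tuple has a fixed length, so the same weights can be
    applied to a board of any size.
    """
    w_in, w_row, w_col, bias = params
    n_rows, n_cols = len(board), len(board[0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for fr in (False, True):
        for fc in (False, True):
            g = board
            if fr:
                g = flip_rows(g)
            if fc:
                g = flip_cols(g)
            h = sweep(g, w_in, w_row, w_col, bias)
            # Undo the flips so hidden values line up with board cells.
            if fc:
                h = flip_cols(h)
            if fr:
                h = flip_rows(h)
            for i in range(n_rows):
                for j in range(n_cols):
                    total[i][j] += h[i][j]
    return total


# Illustrative weights only; in the paper's setting these would be
# found by evolution or coevolution.
params = (0.5, 0.3, -0.2, 0.1)
small_scores = mdrnn_scores([[0] * 5 for _ in range(5)], params)
large_scores = mdrnn_scores([[0] * 9 for _ in range(9)], params)
```

Because the four parameters are independent of the board dimensions, a network tuned on a 5x5 board can be evaluated unchanged on a 9x9 board, which is the property the correlation claim above relies on.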
