Paper information - A scalable neural network architecture for board games

This paper proposes to use multi-dimensional recurrent neural networks (MDRNNs) as a way to overcome one of the key problems in flexible-size board games: scalability. We show why this architecture is well suited to the domain and how it can be successfully trained to play those games, even without any domain-specific knowledge. We find that performance on small boards correlates well with performance on large ones, and that this property holds for networks trained by either evolution or coevolution.

Tom Schaul | Jürgen Schmidhuber
