Lessons Learned from AlphaGo

The game of Go is known to be one of the most complicated board games. Competing against a professional human player in Go has been a long-standing challenge for AI. In this paper, we shed light on the AlphaGo program, which beat a Go world champion, a feat previously considered unachievable for state-of-the-art AI.
