Mastering the game of Go with deep neural networks and tree search
David Silver | Aja Huang | Chris J. Maddison | Arthur Guez | Laurent Sifre | George van den Driessche | Julian Schrittwieser | Ioannis Antonoglou | Vedavyas Panneershelvam | Marc Lanctot | Sander Dieleman | Dominik Grewe | John Nham | Nal Kalchbrenner | Ilya Sutskever | Timothy P. Lillicrap | Madeleine Leach | Koray Kavukcuoglu | Thore Graepel | Demis Hassabis
[1] A. L. Samuel, et al. Some studies in machine learning using the game of checkers. II: recent progress, 1967.
[2] Nils J. Nilsson, et al. Artificial Intelligence, 1974, IFIP Congress.
[3] Donald E. Knuth, et al. The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm, 1981, ICALP.
[4] Hans J. Berliner, et al. A Chronology of Computer Chess and its Literature, 1978, Artif. Intell..
[5] Jonathan Schaeffer, et al. A World Championship Caliber Checkers Program, 1992, Artif. Intell..
[6] Terrence J. Sejnowski, et al. Temporal Difference Learning of Position Evaluation in the Game of Go, 1993, NIPS.
[7] L. V. Allis, et al. Searching for solutions in games and artificial intelligence, 1994.
[8] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.
[9] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[10] Gerald Tesauro, et al. On-line Policy Improvement using Monte-Carlo Search, 1996, NIPS.
[11] Ah Chung Tsoi, et al. Face recognition: a convolutional neural-network approach, 1997, IEEE Trans. Neural Networks.
[12] Michael Buro, et al. From Simple Features to Sophisticated Evaluation Functions, 1998, Computers and Games.
[13] D. A. Mechner, et al. All Systems Go, 1998.
[14] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[15] Jonathan Schaeffer, et al. The games computers (and people) play, 2000, Adv. Comput..
[16] Jonathan Schaeffer, et al. Temporal Difference Learning Applied to a High-Performance Game-Playing Program, 2001, IJCAI.
[17] Fredrik A. Dahl, et al. Honte, a go-playing program using neural nets, 2001.
[18] Murray Campbell, et al. Deep Blue, 2002, Artif. Intell..
[19] Martin Müller, et al. Computer Go, 2002, Artif. Intell..
[20] Brian Sheppard, et al. World-championship-caliber Scrabble, 2002, Artif. Intell..
[21] H. Jaap van den Herik, et al. Games solved: Now and in the future, 2002, Artif. Intell..
[22] Markus Enzenberger, et al. Evaluation in Go by a Neural Network using Soft Segmentation, 2003, ACG.
[23] Bruno Bouzy, et al. Monte-Carlo Go Developments, 2003, ACG.
[24] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[25] Andrew Tridgell, et al. Learning to Play Chess Using Temporal Differences, 2000, Machine Learning.
[26] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[27] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[28] Thore Graepel, et al. Bayesian pattern ranking for move prediction in the game of Go, 2006, ICML.
[29] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[30] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[31] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[32] Jacek Mandziuk, et al. Computational Intelligence in Mind Games, 2007, Challenges for Computational Intelligence.
[33] David Silver, et al. Combining online and offline knowledge in UCT, 2007, ICML '07.
[34] Rémi Coulom, et al. Computing "Elo Ratings" of Move Patterns in the Game of Go, 2007, J. Int. Comput. Games Assoc..
[35] David Silver, et al. Combining Online and Offline Learning in UCT, 2007.
[36] Rémi Coulom, et al. Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength, 2008, Computers and Games.
[37] Ilya Sutskever, et al. Mimicking Go Experts with Convolutional Neural Networks, 2008, ICANN.
[38] Gerald Tesauro, et al. Monte-Carlo simulation balancing, 2009, ICML '09.
[39] Joel Veness, et al. Bootstrapping from Game Tree Search, 2009, NIPS.
[40] Martin Müller, et al. A Lock-Free Multithreaded Monte-Carlo Tree Search Algorithm, 2009, ACG.
[41] Hendrik Baier, et al. The Power of Forgetting: Improving the Last-Good-Reply Policy in Monte Carlo Go, 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[42] Martin Müller, et al. Fuego—An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search, 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[43] Shih-Chieh Huang, et al. Monte-Carlo Simulation Balancing in Practice, 2010, Computers and Games.
[44] Shih-Chieh Huang, et al. Time Management for Monte-Carlo Tree Search Applied to the Game of Go, 2010, 2010 International Conference on Technologies and Applications of Artificial Intelligence.
[45] Richard B. Segal, et al. On the Scalability of Parallel UCT, 2010, Computers and Games.
[46] Mark H. M. Winands, et al. Active Opening Book Application for Monte-Carlo Tree Search in 19×19 Go, 2011.
[47] Petr Baudis, et al. Balancing MCTS by Dynamically Adjusting the Komi Value, 2011, J. Int. Comput. Games Assoc..
[48] Petr Baudis, et al. PACHI: State of the Art Open Source Go Program, 2011, ACG.
[49] Christopher D. Rosin, et al. Multi-armed bandits with episode context, 2011, Annals of Mathematics and Artificial Intelligence.
[50] David Silver, et al. Monte-Carlo tree search and rapid action value estimation in computer Go, 2011, Artif. Intell..
[51] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[52] Richard S. Sutton, et al. Temporal-difference search in computer Go, 2012, Machine Learning.
[53] Michèle Sebag, et al. The grand challenge of computer Go, 2012, Commun. ACM.
[54] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[55] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[56] Shih-Chieh Huang, et al. Investigating the Limits of Monte-Carlo Tree Search Methods in Computer Go, 2013, Computers and Games.
[57] Nathan R. Sturtevant, et al. Monte Carlo Tree Search with heuristic evaluations using implicit minimax backups, 2014, 2014 IEEE Conference on Computational Intelligence and Games.
[58] David Silver, et al. Move Evaluation in Go Using Deep Convolutional Neural Networks, 2014, ICLR.
[59] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.
[60] Amos J. Storkey, et al. Training Deep Convolutional Neural Networks to Play Go, 2015, ICML.
[61] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[62] Liuqing Yang, et al. Where does AlphaGo go: from Church-Turing thesis to AlphaGo thesis and beyond, 2016, IEEE/CAA Journal of Automatica Sinica.
[63] Lars Chittka, et al. Faculty Opinions recommendation of Mastering the game of Go with deep neural networks and tree search, 2016.
[64] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[65] Steffen Hölldobler, et al. Lessons Learned from AlphaGo, 2017, YSIP.
[66] Kazuki Yoshizoe. Understand It in 5 Minutes!? Skimming Famous Papers: Silver, D. et al.: Mastering the Game of Go without Human Knowledge, 2018.
[67] Demis Hassabis, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, 2018, Science.