Learning to Search with MCTSnets
Rémi Munos | Oriol Vinyals | Karen Simonyan | David Silver | Daan Wierstra | Ioannis Antonoglou | Arthur Guez | Theophane Weber
[1] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[2] Donald E. Knuth,et al. An Analysis of Alpha-Beta Pruning , 1975, Artif. Intell..
[3] Gerald Tesauro,et al. Connectionist Learning of Expert Preferences by Comparison Training , 1988, NIPS.
[4] Stuart J. Russell,et al. On Optimal Game-Tree Search using Rational Meta-Reasoning , 1989, IJCAI.
[5] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[6] Michael A. Arbib,et al. The handbook of brain theory and neural networks , 1995, A Bradford book.
[7] Stuart J. Russell. Rationality and Intelligence , 1995, IJCAI.
[8] Jonathan Baxter. KnightCap: A chess program that learns by combining TD(λ) with game-tree search , 1998 .
[9] Jonathan Schaeffer,et al. The games computers (and people) play , 2000, Adv. Comput..
[10] Jonathan Schaeffer,et al. Temporal Difference Learning Applied to a High-Performance Game-Playing Program , 2001, IJCAI.
[11] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[12] Jonathan Schaeffer,et al. Using Abstraction for Planning in Sokoban , 2002, Computers and Games.
[13] Csaba Szepesvári,et al. RSPSA: Enhanced Parameter Optimization in Games , 2006, ACG.
[14] Rémi Coulom,et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search , 2006, Computers and Games.
[15] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[16] Joel Veness,et al. Bootstrapping from Game Tree Search , 2009, NIPS.
[17] M. Jünger,et al. 50 Years of Integer Programming 1958-2008 - From the Early Years to the State-of-the-Art , 2010 .
[18] Stuart J. Russell,et al. Metareasoning for Monte Carlo Tree Search , 2011 .
[19] Christopher D. Rosin,et al. Multi-armed bandits with episode context , 2011, Annals of Mathematics and Artificial Intelligence.
[20] Pieter Abbeel,et al. Gradient Estimation Using Stochastic Computation Graphs , 2015, NIPS.
[21] John Langford,et al. Learning to Search Better than Your Teacher , 2015, ICML.
[22] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[23] Tom Schaul,et al. The Predictron: End-To-End Learning and Planning , 2016, ICML.
[24] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[25] David Barber,et al. Thinking Fast and Slow with Deep Learning and Tree Search , 2017, NIPS.
[26] Razvan Pascanu,et al. Learning model-based planning from scratch , 2017, ArXiv.
[27] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[28] Shimon Whiteson,et al. TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning , 2017, ICLR 2018.