Monte-Carlo tree search and rapid action value estimation in computer Go
[1] Ryan B. Hayward, et al. MOHEX Wins Hex Tournament, 2012, J. Int. Comput. Games Assoc.
[2] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[3] Martin Müller, et al. Fuego—An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search, 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[4] Shih-Chieh Huang, et al. Monte-Carlo Simulation Balancing in Practice, 2010, Computers and Games.
[5] Michèle Sebag, et al. Feature Selection as a One-Player Game, 2010, ICML.
[6] Thierry Moudenc, et al. Introduction of a New Paraphrase Generation Tool Based on Monte-Carlo Sampling, 2009, ACL.
[7] Alan Fern, et al. UCT for Tactical Assault Planning in Real-Time Strategy Games, 2009, IJCAI.
[8] Gerald Tesauro, et al. Monte-Carlo Simulation Balancing, 2009, ICML.
[9] Mark H. M. Winands, et al. Evaluation Function Based Monte-Carlo LOA, 2009, ACG.
[10] Olivier Teytaud, et al. Creating an Upper-Confidence-Tree Program for Havannah, 2009, ACG.
[11] David Silver, et al. Reinforcement Learning and Simulation-Based Search in Computer Go, 2009.
[12] David Silver, et al. Reinforcement Learning and Simulation-Based Search in the Game of Go, 2009.
[13] H. Jaap van den Herik, et al. Progressive Strategies for Monte-Carlo Tree Search, 2008.
[14] Richard J. Lorentz. Amazons Discover Monte-Carlo, 2008, Computers and Games.
[15] Nathan R. Sturtevant, et al. An Analysis of UCT in Multi-Player Games, 2008, J. Int. Comput. Games Assoc.
[16] David Silver, et al. Achieving Master Level Play in 9×9 Computer Go, 2008, AAAI.
[17] Yngvi Björnsson, et al. Simulation-Based Approach to General Game Playing, 2008, AAAI.
[18] Richard S. Sutton, et al. Sample-Based Learning and Search with Permanent and Transient Memories, 2008, ICML.
[19] Olivier Teytaud, et al. On the Parallelization of Monte-Carlo Planning, 2008, ICINCO.
[20] Olivier Teytaud, et al. The Parallelization of Monte-Carlo Planning: Parallelization of MC-Planning, 2008, ICINCO-ICSO.
[21] S. Gelly, et al. Combining Expert, Offline, Transient and Online Knowledge in Monte-Carlo Exploration, 2008.
[22] David Silver, et al. Combining Online and Offline Knowledge in UCT, 2007, ICML.
[23] Sylvain Gelly, et al. Modifications of UCT and Sequence-Like Simulations for Monte-Carlo Go, 2007, IEEE Symposium on Computational Intelligence and Games.
[24] Richard S. Sutton, et al. Reinforcement Learning of Local Shape in the Game of Go, 2007, IJCAI.
[25] David Silver, et al. Combining Online and Offline Learning in UCT, 2007.
[26] Rémi Coulom, et al. Computing "Elo Ratings" of Move Patterns in the Game of Go, 2007, J. Int. Comput. Games Assoc.
[27] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[28] Bruno Bouzy, et al. Move-Pruning Techniques for Monte-Carlo Go, 2006, ACG.
[29] Bruno Bouzy, et al. History and Territory Heuristics for Monte-Carlo Go, 2006.
[30] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[31] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[32] Bruno Bouzy, et al. Associating Domain-Dependent Knowledge and Monte Carlo Approaches within a Go Program, 2005, Inf. Sci.
[33] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[34] Bruno Bouzy, et al. Associating Shallow and Selective Global Tree Search with Monte Carlo for 9×9 Go, 2004, Computers and Games.
[35] Peter Auer, et al. Finite-Time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[36] Bruno Bouzy, et al. Monte-Carlo Go Developments, 2003, ACG.
[37] Markus Enzenberger, et al. Evaluation in Go by a Neural Network Using Soft Segmentation, 2003, ACG.
[38] Martin Müller, et al. Computer Go, 2002, Artif. Intell.
[39] Brian Sheppard, et al. World-Championship-Caliber Scrabble, 2002, Artif. Intell.
[40] Dap Hartmann, et al. Machines That Learn to Play Games, 2002.
[41] Bruno Bouzy, et al. Computer Go: An AI Oriented Survey, 2001, Artif. Intell.
[42] Jonathan Schaeffer, et al. Temporal Difference Learning Applied to a High-Performance Game-Playing Program, 2001, IJCAI.
[43] Johannes Fürnkranz, et al. Machines That Learn to Play Games, 2001.
[44] Fredrik A. Dahl, et al. Honte, a Go-Playing Program Using Neural Nets, 2001.
[45] Jonathan Schaeffer, et al. The Games Computers (and People) Play, 2000, Adv. Comput.
[46] Jonathan Schaeffer, et al. Using Probabilistic Knowledge and Simulation to Play Poker, 1999, AAAI/IAAI.
[47] Michael Buro, et al. From Simple Features to Sophisticated Evaluation Functions, 1998, Computers and Games.
[48] J. McCarthy. AI as Sport, 1997, Science.
[49] Gerald Tesauro, et al. On-line Policy Improvement Using Monte-Carlo Search, 1996, NIPS.
[50] M. Enzenberger. The Integration of A Priori Knowledge into a Go Playing Neural Network, 1996.
[51] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996.
[52] Terrence J. Sejnowski, et al. Temporal Difference Learning of Position Evaluation in the Game of Go, 1993, NIPS.
[53] Bernd Brügmann. Monte Carlo Go, 1993.
[54] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[55] Bruce Abramson, et al. Expected-Outcome: A General Model of Static Evaluation, 1990, IEEE Trans. Pattern Anal. Mach. Intell.
[56] Jonathan Schaeffer, et al. The History Heuristic and Alpha-Beta Search Enhancements in Practice, 1989, IEEE Trans. Pattern Anal. Mach. Intell.
[57] D. Sandbach. All Systems Go, 1986, The Health Service Journal.