The Last-Good-Reply Policy for Monte-Carlo Go
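The paper's core idea is a lightweight playout policy for Monte-Carlo Go: for each opponent move, remember the reply that most recently appeared in a winning simulation, and prefer that stored reply (when legal) over the default random playout move. The sketch below is a minimal illustration of that idea, not the paper's implementation; the class name, the move representation, and the `winner_is_even` parity convention are assumptions for demonstration.

```python
import random

class LastGoodReply:
    """Minimal sketch of a Last-Good-Reply table (assumed interface):
    maps an opponent's previous move to the reply last seen in a
    winning playout."""

    def __init__(self):
        self.reply = {}  # previous move -> last good reply

    def choose(self, prev_move, legal_moves):
        # Prefer the stored reply to the opponent's previous move if it
        # is legal; otherwise fall back to a uniform-random playout move.
        r = self.reply.get(prev_move)
        if r is not None and r in legal_moves:
            return r
        return random.choice(legal_moves)

    def update(self, moves, winner_is_even):
        # After a simulation, store each of the winner's moves as the
        # "last good reply" to the move that preceded it. `moves` is the
        # alternating move sequence; the winner played the even-indexed
        # moves iff winner_is_even is True.
        for i in range(1, len(moves)):
            if (i % 2 == 0) == winner_is_even:
                self.reply[moves[i - 1]] = moves[i]
```

In a full engine this table would sit inside the Monte-Carlo playout loop, with `update` called once per finished simulation; variants in the literature also index on the previous two moves or forget replies after losses.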