Monte-Carlo simulation balancing
[1] David Silver, et al. Combining online and offline knowledge in UCT, 2007, ICML '07.
[2] Rémi Coulom, et al. Computing "Elo Ratings" of Move Patterns in the Game of Go, 2007, J. Int. Comput. Games Assoc.
[3] Jonathan Schaeffer, et al. Using Probabilistic Knowledge and Simulation to Play Poker, 1999, AAAI/IAAI.
[4] Gerald Tesauro, et al. On-line Policy Improvement using Monte-Carlo Search, 1996, NIPS.
[5] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[6] Richard S. Sutton, et al. Reinforcement Learning of Local Shape in the Game of Go, 2007, IJCAI.
[7] Brian Sheppard, et al. World-championship-caliber Scrabble, 2002, Artif. Intell.
[8] David Silver, et al. Combining Online and Offline Learning in UCT, 2007.
[9] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[10] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[11] Yngvi Björnsson, et al. Simulation-Based Approach to General Game Playing, 2008, AAAI.
[12] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.