Cross-Entropy for Monte-Carlo Tree Search
H. Jaap van den Herik | Mark H. M. Winands | Guillaume Chaslot | István Szita
[1] Tristan Cazenave, et al. Playing the Right Atari, 2007, J. Int. Comput. Games Assoc.
[2] Heinz Mühlenbein, et al. The Equation for Response to Selection and Its Use for Prediction, 1997, Evolutionary Computation.
[3] Shie Mannor, et al. A Tutorial on the Cross-Entropy Method, 2005, Ann. Oper. Res.
[4] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[5] T. Anthony Marsland, et al. Learning extension parameters in game-tree search, 2003, Inf. Sci.
[6] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[7] Rémi Coulom, et al. Computing "Elo Ratings" of Move Patterns in the Game of Go, 2007, J. Int. Comput. Games Assoc.
[8] Rémi Munos, et al. Bandit Algorithms for Tree Search, 2007, UAI.
[9] H. Jaap van den Herik, et al. Progressive Strategies for Monte-Carlo Tree Search, 2008.
[10] Donald E. Knuth, et al. An Analysis of Alpha-Beta Pruning, 1975, Artif. Intell.
[11] Bruno Bouzy, et al. Monte-Carlo Go Reinforcement Learning Experiments, 2006, 2006 IEEE Symposium on Computational Intelligence and Games.
[12] Jonathan Schaeffer, et al. Temporal Difference Learning Applied to a High-Performance Game-Playing Program, 2001, IJCAI.
[13] Csaba Szepesvári, et al. RSPSA: Enhanced Parameter Optimization in Games, 2006, ACG.
[14] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] R. Rubinstein. The Cross-Entropy Method for Combinatorial and Continuous Optimization, 1999.
[16] Andrew Tridgell, et al. Experiments in Parameter Learning Using Temporal Differences, 1998, J. Int. Comput. Games Assoc.
[17] Bruno Bouzy, et al. Associating domain-dependent knowledge and Monte Carlo approaches within a Go program, 2005, Inf. Sci.
[18] Donald F. Beal, et al. Temporal Difference Learning for Heuristic Search and Game Playing, 2000, Inf. Sci.
[19] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[20] Jos W. H. M. Uiterwijk, et al. Temporal Difference Learning and the Neural MoveMap Heuristic in the Game of Lines of Action, 2002.
[21] Dirk P. Kroese, et al. Convergence properties of the cross-entropy method for discrete optimization, 2007, Oper. Res. Lett.
[22] H. Robbins. A Stochastic Approximation Method, 1951.
[23] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[24] Erik C. D. van der Werf. STEENVRETER WINS 9x9 GO TOURNAMENT, 2007.