Monte-Carlo Tree Search: To MC or to DP?
[1] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[2] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[3] Nathan R. Sturtevant, et al. An Analysis of UCT in Multi-Player Games, 2008, J. Int. Comput. Games Assoc.
[4] Rémi Munos, et al. Bandit Algorithms for Tree Search, 2007, UAI.
[5] Frédérick Garcia, et al. On-Line Search for Solving Markov Decision Processes via Heuristic Sampling, 2004, ECAI.
[6] Carmel Domshlak, et al. Simple Regret Optimization in Online Planning for Markov Decision Processes, 2012, J. Artif. Intell. Res.
[7] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[8] Malte Helmert, et al. High-Quality Policies for the Canadian Traveler's Problem, 2010, SOCS.
[9] Alan Fern, et al. Lower Bounding Klondike Solitaire with Monte-Carlo Planning, 2009, ICAPS.
[10] Sean R Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[11] U. Rieder, et al. Markov Decision Processes, 2010.
[12] Rémi Munos, et al. Pure exploration in finitely-armed and continuous-armed bandits, 2011, Theor. Comput. Sci.
[13] Carmel Domshlak, et al. On MABs and Separation of Concerns in Monte-Carlo Planning for MDPs, 2014, ICAPS.
[14] Carmel Domshlak, et al. Monte-Carlo Planning: Theoretically Fast Convergence Meets Practical Efficiency, 2013, UAI.
[15] David Tolpin, et al. MCTS Based on Simple Regret, 2012, AAAI.
[16] Tristan Cazenave, et al. Nested Monte-Carlo Search, 2009, IJCAI.
[17] Rémi Munos, et al. Open Loop Optimistic Planning, 2010, COLT.
[18] Christopher D. Rosin, et al. Nested Rollout Policy Adaptation for Monte Carlo Tree Search, 2011, IJCAI.
[19] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[20] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[21] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[22] David Silver, et al. Monte-Carlo tree search and rapid action value estimation in computer Go, 2011, Artif. Intell.
[23] Malte Helmert, et al. Trial-Based Heuristic Tree Search for Finite Horizon MDPs, 2013, ICAPS.