Improving UCT planning via approximate homomorphisms
[1] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[2] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[3] Balaraman Ravindran, et al. Model Minimization in Hierarchical Reinforcement Learning, 2002, SARA.
[4] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[5] Yishay Mansour, et al. Approximate Equivalence of Markov Decision Processes, 2003, COLT.
[6] Doina Precup, et al. Metrics for Finite Markov Decision Processes, 2004, AAAI.
[7] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.
[8] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[9] A. Barto, et al. An algebraic approach to abstraction in reinforcement learning, 2004.
[10] Doina Precup, et al. Methods for Computing State Similarity in Markov Decision Processes, 2006, UAI.
[11] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[12] Philip Hingston, et al. Experiments with Monte Carlo Othello, 2007, 2007 IEEE Congress on Evolutionary Computation.
[13] Sylvain Gelly, et al. Modifications of UCT and sequence-like simulations for Monte-Carlo Go, 2007, 2007 IEEE Symposium on Computational Intelligence and Games.
[14] Alan Fern, et al. Lower Bounding Klondike Solitaire with Monte-Carlo Planning, 2009, ICAPS.
[15] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[16] Nataliya Sokolovska, et al. Continuous Upper Confidence Trees, 2011, LION.
[17] David Silver, et al. Monte-Carlo tree search and rapid action value estimation in computer Go, 2011, Artif. Intell.
[18] Peter Stone, et al. TEXPLORE: real-time sample-efficient reinforcement learning for robots, 2012, Machine Learning.
[19] Tuomas Sandholm, et al. Lossy stochastic game abstraction with bounds, 2012, EC '12.
[20] Balaraman Ravindran. Approximate Homomorphisms: A framework for non-exact minimization in Markov Decision Processes, 2022.