The Dependence of Effective Planning Horizon on Model Accuracy
Nan Jiang | Alex Kulesza | Richard L. Lewis | Satinder Singh
[1] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[2] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[3] Michael I. Jordan, et al. Advances in Neural Information Processing Systems, 1995.
[4] Balaraman Ravindran. Approximate Homomorphisms: A framework for non-exact minimization in Markov Decision Processes, 2004.
[5] Ambuj Tewari, et al. Sample Complexity of Policy Search with Known Dynamics, 2006, NIPS.
[6] Csaba Szepesvári, et al. Error Propagation for Approximate Policy and Value Iteration, 2010, NIPS.
[7] John N. Tsitsiklis, et al. Bias and Variance Approximation in Value Function Estimates, 2007, Manag. Sci.
[8] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[9] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[10] Ran El-Yaniv, et al. Transductive Rademacher Complexity and Its Applications, 2007, COLT.
[11] Xiaojin Zhu, et al. Human Rademacher Complexity, 2009, NIPS.
[12] Lihong Li, et al. Reinforcement Learning in Finite MDPs: PAC Analysis, 2009, J. Mach. Learn. Res.
[13] Umesh V. Vazirani, et al. An Introduction to Computational Learning Theory, 1994.
[14] Marek Petrik, et al. Biasing Approximate Dynamic Programming with a Lower Discount Factor, 2008, NIPS.
[15] Nan Jiang, et al. Improving UCT planning via approximate homomorphisms, 2014, AAMAS.
[16] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[17] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 1998, Machine Learning.
[18] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.
[19] M. D. Wilkinson, et al. Management science, 1989, British Dental Journal.
[20] Vladimir Vapnik, et al. Principles of Risk Minimization for Learning Theory, 1991, NIPS.
[21] V. Koltchinskii, et al. Rademacher Processes and Bounding the Risk of Function Learning, 2004, math/0405338.
[22] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
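For context, the titled paper studies how planning with a discount factor smaller than the evaluation discount can act as regularization when the model is estimated from limited data (cf. reference [14]). A minimal sketch of Q-value iteration with an adjustable planning discount on a hypothetical two-state MDP; the MDP, its numbers, and the function name are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical toy MDP: 2 states, 2 actions.
# P[a, s, s'] = transition probability, R[s, a] = immediate reward.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def q_value_iteration(P, R, gamma, iters=500):
    """Compute (approximately) optimal Q-values for planning discount gamma."""
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)                       # greedy state values
        Q = R + gamma * np.einsum('ast,t->sa', P, V)  # Bellman backup
    return Q

# Planning with a shorter effective horizon (smaller gamma) uses less of
# the (possibly inaccurate) model's long-range predictions.
for gamma_plan in (0.5, 0.99):
    policy = q_value_iteration(P, R, gamma_plan).argmax(axis=1)
    print(gamma_plan, policy)
```

Varying `gamma_plan` while evaluating the resulting greedy policy under the true discount is the kind of experiment the paper's analysis concerns: with an approximate model, an intermediate planning discount can outperform the evaluation discount itself.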