An empirical evaluation of interval estimation for Markov decision processes