Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains