[1] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[2] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[3] Andrew G. Barto,et al. Local Bandit Approximation for Optimal Learning Problems , 1996, NIPS.
[4] S. MacEachern,et al. Estimating mixture of dirichlet process models , 1998 .
[5] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[6] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[7] M. Escobar,et al. Markov Chain Sampling Methods for Dirichlet Process Mixture Models , 2000 .
[8] B. E. Eckbo,et al. Appendix , 1826, Epilepsy Research.
[9] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[10] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .
[11] Michael O. Duff,et al. Design for an Optimal Probe , 2003, ICML.
[12] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[13] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[14] Tao Wang,et al. Bayesian sparse sampling for on-line reward optimization , 2005, ICML.
[15] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[16] Jesse Hoey,et al. An analytic solution to discrete Bayesian reinforcement learning , 2006, ICML.
[17] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[18] Michael L. Littman,et al. Efficient Structure Learning in Factored-State MDPs , 2007, AAAI.
[19] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[20] Michael L. Littman,et al. Efficient Reinforcement Learning with Relocatable Action Models , 2007, AAAI.
[21] Michael L. Littman,et al. An analysis of model-based Interval Estimation for Markov Decision Processes , 2008, J. Comput. Syst. Sci..
[22] David Silver,et al. Achieving Master Level Play in 9 × 9 Computer Go , 2008, AAAI.
[23] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[24] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[25] Thomas J. Walsh,et al. Exploring compact reinforcement-learning representations with linear regression , 2009, UAI.
[26] Michael L. Littman,et al. A unifying framework for computational reinforcement learning theory , 2009 .
[27] Richard L. Lewis,et al. Variance-Based Rewards for Approximate Bayesian Reinforcement Learning , 2010, UAI.
[28] Thomas J. Walsh,et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning , 2010, AAAI.