The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning
[1] Pieter Abbeel, et al. Learning Factor Graphs in Polynomial Time and Sample Complexity, 2006, J. Mach. Learn. Res.
[2] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[3] Keiji Kanazawa, et al. A model for reasoning about persistence and causation, 1989.
[4] Shobha Venkataraman, et al. Efficient Solution Algorithms for Factored MDPs, 2003, J. Artif. Intell. Res.
[5] Michael L. Littman, et al. A theoretical analysis of Model-Based Interval Estimation, 2005, ICML.
[6] Michael L. Littman, et al. A unifying framework for computational reinforcement learning theory, 2009.
[7] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[8] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[9] Michael L. Littman, et al. Efficient Structure Learning in Factored-State MDPs, 2007, AAAI.
[10] Thomas J. Walsh, et al. Knows what it knows: a framework for self-aware learning, 2008, ICML.
[11] Lihong Li, et al. Incremental Model-based Learners With Formal Learning-Time Guarantees, 2006, UAI.
[12] Nicholas Roy, et al. CORL: A Continuous-state Offset-dynamics Reinforcement Learner, 2008, UAI.
[13] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[14] David Haussler, et al. How to use expert advice, 1993, STOC.
[15] Michael L. Littman, et al. Efficient Reinforcement Learning with Relocatable Action Models, 2007, AAAI.
[16] Lihong Li, et al. PAC model-free reinforcement learning, 2006, ICML.
[17] Robert E. Schapire, et al. Efficient distribution-free learning of probabilistic concepts, 1990, Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science.
[18] Alexander L. Strehl, et al. Model-Based Reinforcement Learning in Factored-State MDPs, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[19] John N. Tsitsiklis, et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem, 2004, J. Mach. Learn. Res.
[20] Kenji Yamanishi, et al. A learning criterion for stochastic rules, 1990, COLT '90.
[21] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[22] Michael Kearns, et al. Efficient Reinforcement Learning in Factored MDPs, 1999, IJCAI.
[23] Leslie G. Valiant, et al. A theory of the learnable, 1984, CACM.