Learning and Using Models
[1] Michael L. Littman, et al. A theoretical analysis of Model-Based Interval Estimation, 2005, ICML.
[2] C. Atkeson, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time, 1993.
[3] Michael O. Duff, et al. Design for an Optimal Probe, 2003, ICML.
[4] Peter Stone, et al. Real time targeted exploration in large domains, 2010, 2010 IEEE 9th International Conference on Development and Learning.
[5] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SIGART Bulletin.
[6] Michael L. Littman, et al. Efficient Structure Learning in Factored-State MDPs, 2007, AAAI.
[7] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[8] S. Schaal, et al. Robot juggling: implementation of memory-based learning, 1994, IEEE Control Systems.
[9] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, Proceedings of the 1991 IEEE International Joint Conference on Neural Networks.
[10] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[11] Michail G. Lagoudakis, et al. Binary action search for learning continuous-action control policies, 2009, ICML.
[12] Peter Stone, et al. Structure Learning in Ergodic Factored MDPs without Knowledge of the Transition Function's In-Degree, 2011, ICML.
[13] Peter Stone, et al. Generalized model learning for reinforcement learning in factored domains, 2009, AAMAS.
[14] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[15] Neil D. Lawrence, et al. Missing Data in Kernel PCA, 2006, ECML.
[16] Pierre-Yves Oudeyer, et al. R-IAC: Robust Intrinsically Motivated Exploration and Active Learning, 2009, IEEE Transactions on Autonomous Mental Development.
[17] Craig Boutilier, et al. Stochastic dynamic programming with factored representations, 2000, Artificial Intelligence.
[18] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[19] Tao Wang, et al. Bayesian sparse sampling for on-line reward optimization, 2005, ICML.
[20] Michael L. Littman, et al. Sample-Based Planning for Continuous Action Markov Decision Processes, 2011, ICAPS.
[21] Leo Breiman, et al. Random Forests, 2001, Machine Learning.
[22] Lihong Li, et al. The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning, 2009, ICML.
[23] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[24] Stewart W. Wilson, et al. From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, 1997.
[25] Dale Schuurmans, et al. Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs, 2002, ICML.
[26] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[27] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time, 2009, ICML.
[28] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[29] Michael L. Littman, et al. Dimension reduction and its application to model-based exploration in continuous spaces, 2010, Machine Learning.
[30] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, Journal of Machine Learning Research.
[31] Richard Fikes, et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving, 1971, IJCAI.
[32] Yishay Mansour, et al. Learning Rates for Q-learning, 2004, Journal of Machine Learning Research.
[33] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[34] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[35] Jesse Hoey, et al. An analytic solution to discrete Bayesian reinforcement learning, 2006, ICML.
[36] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[37] Thomas J. Walsh, et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning, 2010, AAAI.
[38] M. A. Wiering, et al. Reinforcement Learning in Continuous Action Spaces, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[39] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[40] Carl E. Rasmussen, et al. Gaussian Processes in Reinforcement Learning, 2003, NIPS.
[41] Lihong Li, et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning, 2009, UAI.
[42] Peter Stone, et al. RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for robot control, 2011, 2012 IEEE International Conference on Robotics and Automation.
[43] Sylvain Gelly, et al. Modifications of UCT and sequence-like simulations for Monte-Carlo Go, 2007, 2007 IEEE Symposium on Computational Intelligence and Games.
[44] Peter Stone, et al. Model-based function approximation in reinforcement learning, 2007, AAMAS.
[45] Andrew W. Moore, et al. Locally Weighted Learning for Control, 1997, Artificial Intelligence Review.
[46] David Andre, et al. Model based Bayesian Exploration, 1999, UAI.
[47] Andre Cohen, et al. An object-oriented representation for efficient reinforcement learning, 2008, ICML.
[48] Maja J. Matarić, et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks, 1996.
[49] S. Shankar Sastry, et al. Autonomous Helicopter Flight via Reinforcement Learning, 2003, NIPS.
[50] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, Journal of Machine Learning Research.
[51] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[52] Richard S. Sutton, et al. Sample-based learning and search with permanent and transient memories, 2008, ICML.
[53] Peter Stone, et al. Generalized model learning for Reinforcement Learning on a humanoid robot, 2010, 2010 IEEE International Conference on Robotics and Automation.
[54] Andrew McCallum, et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks, 1996.
[55] Olivier Sigaud, et al. Learning the structure of Factored Markov Decision Processes in reinforcement learning problems, 2006, ICML.
[56] Leslie Pack Kaelbling, et al. Learning Probabilistic Relational Planning Rules, 2004, ICAPS.
[57] Thomas J. Walsh, et al. Knows what it knows: a framework for self-aware learning, 2008, ICML.
[58] Michael Kearns, et al. Efficient Reinforcement Learning in Factored MDPs, 1999, IJCAI.
[59] Jürgen Schmidhuber, et al. Efficient model-based exploration, 1998.
[60] Lihong Li, et al. Online exploration in least-squares policy iteration, 2009, AAMAS.
[61] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[62] Andrew W. Moore, et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces, 2004, Machine Learning.