TEXPLORE: real-time sample-efficient reinforcement learning for robots
[1] James S. Albus,et al. A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller , 1975 .
[2] James S. Albus,et al. New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC)1 , 1975 .
[3] C. Watkins. Learning from delayed rewards , 1989 .
[4] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[5] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[6] J. R. Quinlan. Learning With Continuous Classes , 1992 .
[7] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[8] Frans M. J. Willems,et al. The context-tree weighting method: basic properties , 1995, IEEE Trans. Inf. Theory.
[9] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[10] Andrew McCallum,et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks , 1996 .
[11] Maja J. Matarić,et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks , 1996 .
[12] Stewart W. Wilson,et al. From Animals to Animats 5. Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior , 1997 .
[13] Jürgen Schmidhuber,et al. Efficient model-based exploration , 1998 .
[14] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[15] David Andre,et al. Model based Bayesian Exploration , 1999, UAI.
[16] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[17] Dale Schuurmans,et al. Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs , 2002, ICML.
[18] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[19] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[20] M. Mauk,et al. What the cerebellum computes , 2003, Trends in Neurosciences.
[21] Michael O. Duff,et al. Design for an Optimal Probe , 2003, ICML.
[22] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[23] Konstantinos V. Katsikopoulos,et al. Markov decision processes with delays and asynchronous cost collection , 2003, IEEE Trans. Autom. Control..
[24] Pierre Geurts,et al. Iteratively Extending Time Horizon Reinforcement Learning , 2003, ECML.
[25] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[26] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[27] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[28] Andrew W. Moore,et al. Variable Resolution Discretization in Optimal Control , 2002, Machine Learning.
[29] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[30] Peter Stone,et al. Machine Learning for Fast Quadrupedal Locomotion , 2004, AAAI.
[31] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[32] Michael L. Littman,et al. A theoretical analysis of Model-Based Interval Estimation , 2005, ICML.
[33] Tao Wang,et al. Bayesian sparse sampling for on-line reward optimization , 2005, ICML.
[34] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.
[35] Jesse Hoey,et al. An analytic solution to discrete Bayesian reinforcement learning , 2006, ICML.
[36] Olivier Sigaud,et al. Learning the structure of Factored Markov Decision Processes in reinforcement learning problems , 2006, ICML.
[37] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[38] David Silver,et al. Combining online and offline knowledge in UCT , 2007, ICML '07.
[39] Michael L. Littman,et al. Efficient Structure Learning in Factored-State MDPs , 2007, AAAI.
[40] Michael L. Littman,et al. Online Linear Regression and Its Application to Model-Based Reinforcement Learning , 2007, NIPS.
[41] Peter Stone,et al. Model-based function approximation in reinforcement learning , 2007, AAMAS '07.
[42] Michael L. Littman,et al. Efficient Reinforcement Learning with Relocatable Action Models , 2007, AAAI.
[43] Pierre-Yves Oudeyer,et al. Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.
[44] Thomas J. Walsh,et al. Knows what it knows: a framework for self-aware learning , 2008, ICML.
[45] H. Jaap van den Herik,et al. Parallel Monte-Carlo Tree Search , 2008, Computers and Games.
[46] Peter Stone,et al. Multiagent interactions in urban driving , 2008 .
[47] Richard S. Sutton,et al. Sample-based learning and search with permanent and transient memories , 2008, ICML '08.
[48] Thomas J. Walsh,et al. Learning and planning in environments with delayed feedback , 2009, Autonomous Agents and Multi-Agent Systems.
[49] Olivier Teytaud,et al. On the Parallelization of Monte-Carlo planning , 2008, ICINCO 2008.
[50] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[51] Lihong Li,et al. The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning , 2009, ICML '09.
[52] Thomas J. Walsh,et al. Exploring compact reinforcement-learning representations with linear regression , 2009, UAI.
[53] Brian Tanner,et al. RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments , 2009, J. Mach. Learn. Res..
[54] Morgan Quigley,et al. ROS: an open-source Robot Operating System , 2009, ICRA 2009.
[55] Todd Hester,et al. An Empirical Comparison of Abstraction in Models of Markov Decision Processes , 2009 .
[56] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2008, NIPS.
[57] Jean Méhat,et al. A Parallel General Game Player , 2010, KI - Künstliche Intelligenz.
[58] Nassim Mafi,et al. Intrinsically motivated information foraging , 2010, 2010 IEEE 9th International Conference on Development and Learning.
[59] Thomas J. Walsh,et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning , 2010, AAAI.
[60] Peter Stone,et al. Real time targeted exploration in large domains , 2010, 2010 IEEE 9th International Conference on Development and Learning.
[61] Robert Babuska,et al. Control delay in Reinforcement Learning for real-time dynamic systems: A memoryless approach , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[62] Joel Veness,et al. A Monte-Carlo AIXI Approximation , 2009, J. Artif. Intell. Res..
[63] Peter Stone,et al. Structure Learning in Ergodic Factored MDPs without Knowledge of the Transition Function's In-Degree , 2011, ICML.
[64] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[65] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[66] Mausam,et al. LRTDP Versus UCT for Online Probabilistic Planning , 2012, AAAI.
[67] Richard S. Sutton,et al. Temporal-difference search in computer Go , 2012, Machine Learning.
[68] Peter Stone,et al. RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for robot control , 2011, 2012 IEEE International Conference on Robotics and Automation.