Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains
Luis C. Cobo | Kaushik Subramanian | Charles Lee Isbell | Aaron D. Lanterman | Andrea Lockerd Thomaz