Learning Options for an MDP from Demonstrations
Xiaodong Li | Fabio Zambetta | William L. Raffe | Marco Tamassia
[1] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[2] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[3] Peter Stone, et al. Scaling Reinforcement Learning toward RoboCup Soccer, 2001, ICML.
[4] Andrea Lockerd Thomaz, et al. Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains, 2014, Artif. Intell.
[5] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[6] Andrew G. Barto, et al. Intrinsically Motivated Hierarchical Skill Learning in Structured Environments, 2010, IEEE Transactions on Autonomous Mental Development.
[7] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[8] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[9] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[10] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[11] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[12] Pat Langley, et al. Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29 - July 2, 2000, 2000, ICML 2000.
[13] Shie Mannor, et al. Dynamic abstraction in reinforcement learning via clustering, 2004, ICML.
[14] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[15] Gaël Varoquaux, et al. The NumPy Array: A Structure for Efficient Numerical Computation, 2011, Computing in Science & Engineering.
[16] Matthieu Geist, et al. Batch, Off-Policy and Model-Free Apprenticeship Learning, 2011, EWRL.
[17] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.
[18] Lorenza Saitta, et al. Abstraction, Reformulation and Approximation, 2008.
[19] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[20] Peter A. Flach, et al. Evaluation Measures for Multi-class Subgroup Discovery, 2009, ECML/PKDD.
[21] Andrew G. Barto, et al. Reinforcement learning, 1998.
[22] Andrew G. Barto, et al. Using relative novelty to identify useful temporal abstractions in reinforcement learning, 2004, ICML.
[23] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[24] Oussama Khatib, et al. Experimental Robotics IV, The 4th International Symposium, Stanford, California, USA, June 30 - July 2, 1995, 1997, ISER.
[25] Eyal Amir, et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.
[26] François Laviolette, et al. Learning with Randomized Majority Votes, 2010, ECML/PKDD.
[27] Peter Stone, et al. The utility of temporal abstraction in reinforcement learning, 2008, AAMAS.
[28] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.
[29] Andrew Tridgell, et al. KnightCap: A Chess Program That Learns by Combining TD(λ) with Game-Tree Search, 1998, ICML.
[30] Libor Preucil, et al. European Robotics Symposium 2008, 2008.
[31] Leslie Pack Kaelbling, et al. Recent Advances in Reinforcement Learning, 1996, Springer US.