David L. Roberts | Matthew E. Taylor | Michael L. Littman | Bei Peng | Robert Loftin
[1] Michael H. Bowling, et al. Apprenticeship learning using linear programming, 2008, ICML '08.
[2] David L. Roberts, et al. Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning, 2015, Autonomous Agents and Multi-Agent Systems.
[3] Michael Bloem, et al. Infinite Time Horizon Maximum Causal Entropy Inverse Reinforcement Learning, 2014, IEEE Transactions on Automatic Control.
[4] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[5] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[6] Cynthia Breazeal, et al. Training a Robot via Human Feedback: A Case Study, 2013, ICSR.
[7] Michael L. Littman, et al. Between Imitation and Intention Learning, 2015, IJCAI.
[8] Dean Pomerleau, et al. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1988, NIPS.
[9] Claude Sammut, et al. A Framework for Behavioural Cloning, 1995, Machine Intelligence 15.
[10] Eyal Amir, et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.
[11] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[12] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[13] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[14] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[15] Y. Benjamini, et al. The Control of the False Discovery Rate in Multiple Testing Under Dependency, 2001.
[16] Csaba Szepesvári, et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods, 2007, UAI.
[17] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[18] Monica C. Vroman. Maximum Likelihood Inverse Reinforcement Learning, 2014.
[19] Wolfram Burgard, et al. Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics, 2016, AISTATS.
[20] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[21] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[22] Jan Peters, et al. Relative Entropy Inverse Reinforcement Learning, 2011, AISTATS.