Prabhat Nagarajan | Scott Niekum | Daniel S. Brown | Wonjoon Goo
[1] Sergey Levine,et al. One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning , 2018, Robotics: Science and Systems.
[2] Stuart J. Russell,et al. Inverse reinforcement learning for video games , 2018, ArXiv.
[3] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[4] Markus Wulfmeier,et al. Maximum Entropy Deep Inverse Reinforcement Learning , 2015, arXiv:1507.04888.
[5] Sonia Chernova,et al. Integrating reinforcement learning with human demonstrations of varying ability , 2011, AAMAS.
[6] Aude Billard,et al. Donut as I do: Learning from failed demonstrations , 2011, 2011 IEEE International Conference on Robotics and Automation.
[7] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[8] Eyal Amir,et al. Bayesian Inverse Reinforcement Learning , 2007, IJCAI.
[9] R. Luce,et al. Individual Choice Behavior: A Theoretical Analysis , 1960 .
[10] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[11] Romain Laroche,et al. Score-based Inverse Reinforcement Learning , 2016, AAMAS.
[12] Shane Legg,et al. Deep Reinforcement Learning from Human Preferences , 2017, NIPS.
[13] Joelle Pineau,et al. OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning , 2017, AAAI.
[14] Sergey Levine,et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning , 2017, ICLR.
[15] Michael C. Yip,et al. Adversarial Imitation via Variational Inverse Reinforcement Learning , 2018, ICLR.
[16] Scott Niekum,et al. One-Shot Learning of Multi-Step Tasks from Observation via Activity Localization in Auxiliary Video , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[17] Michèle Sebag,et al. Preference-Based Policy Learning , 2011, ECML/PKDD.
[18] Prashant Doshi,et al. A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress , 2018, Artif. Intell.
[19] Dean Pomerleau,et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation , 1991, Neural Computation.
[20] Robert E. Schapire,et al. A Game-Theoretic Approach to Apprenticeship Learning , 2007, NIPS.
[21] Katja Hofmann,et al. The Atari Grand Challenge Dataset , 2017, ArXiv.
[22] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[23] Sergey Levine,et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.
[24] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[25] Hiroaki Sugiyama,et al. Preference-learning based Inverse Reinforcement Learning for Dialog Control , 2012, INTERSPEECH.
[26] Jonathan Dodge,et al. Visualizing and Understanding Atari Agents , 2017, ICML.
[27] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[28] Peter Stone,et al. Behavioral Cloning from Observation , 2018, IJCAI.
[29] Jan Peters,et al. Relative Entropy Inverse Reinforcement Learning , 2011, AISTATS.
[30] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst.
[31] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[32] Meng Joo Er,et al. A survey of inverse reinforcement learning techniques , 2012, Int. J. Intell. Comput. Cybern.
[33] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[34] Sergey Levine,et al. Time-Contrastive Networks: Self-Supervised Learning from Video , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[35] Johannes Fürnkranz,et al. A Survey of Preference-Based Reinforcement Learning Methods , 2017, J. Mach. Learn. Res.
[36] Peter Stone,et al. Generative Adversarial Imitation from Observation , 2018, ArXiv.
[37] Johannes Fürnkranz,et al. Model-Free Preference-Based Reinforcement Learning , 2016, AAAI.
[38] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[39] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[40] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[41] Carlo Tomasi,et al. Distance Minimization for Reward Learning from Scored Trajectories , 2016, AAAI.
[42] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2022 .
[43] Shimon Whiteson,et al. Inverse Reinforcement Learning from Failure , 2016, AAMAS.
[44] Shane Legg,et al. Reward learning from human preferences and demonstrations in Atari , 2018, NeurIPS.
[45] Pieter Abbeel,et al. An Algorithmic Perspective on Imitation Learning , 2018, Found. Trends Robotics.
[46] R. A. Bradley,et al. Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons , 1952 .
[47] Songhwai Oh,et al. Robust Learning From Demonstrations With Mixed Qualities Using Leveraged Gaussian Processes , 2019, IEEE Transactions on Robotics.
[48] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[49] R. Duncan Luce,et al. Individual Choice Behavior: A Theoretical Analysis , 1979 .
[50] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[51] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[52] Nando de Freitas,et al. Playing hard exploration games by watching YouTube , 2018, NeurIPS.
[53] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[54] Er Meng Joo,et al. A survey of inverse reinforcement learning techniques , 2012 .
[55] Sergey Levine,et al. Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[56] Siyuan Liu,et al. Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise , 2014, AAAI.
[57] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.