Leveraging Human Guidance for Deep Reinforcement Learning Tasks
Peter Stone | Lin Guan | Dana H. Ballard | Ruohan Zhang | Faraz Torabi