Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
Mudit Verma | Lin Guan | Ruohan Zhang | Sihang Guo | Subbarao Kambhampati
[1] Mudit Verma,et al. Symbols as a Lingua Franca for Bridging Human-AI Chasm for Explainable and Advisable AI Systems, 2021, AAAI.
[2] Pieter Abbeel,et al. PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training , 2021, ICML.
[3] Charles Blundell,et al. Representation Learning via Invariant Causal Mechanisms , 2020, ICLR.
[4] Anca D. Dragan,et al. Feature Expansive Reward Learning: Rethinking Human Input , 2020, 2021 16th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[5] R. Fergus,et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels , 2020, ICLR.
[6] Yuchen Cui,et al. The EMPATHIC Framework for Task Learning from Implicit Human Feedback , 2020, CoRL.
[7] Abhinav Gupta,et al. Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases , 2020, NeurIPS.
[8] Bo Liu,et al. Human Gaze Assisted Artificial Intelligence: A Review , 2020, IJCAI.
[9] Ilya Kostrikov,et al. Automatic Data Augmentation for Generalization in Deep Reinforcement Learning , 2020, ArXiv.
[10] Yasuo Kuniyoshi,et al. Using Human Gaze to Improve Robustness Against Irrelevant Objects in Robot Manipulation Tasks , 2020, IEEE Robotics and Automation Letters.
[11] P. Abbeel,et al. Reinforcement Learning with Augmented Data , 2020, NeurIPS.
[12] Scott Niekum,et al. Efficiently Guiding Imitation Learning Algorithms with Human Gaze , 2020, ArXiv.
[13] Radha Poovendran,et al. FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback , 2020, AAMAS.
[14] Kristian Kersting,et al. Making deep neural networks right for the right scientific reasons by interacting with their explanations , 2020, Nat. Mach. Intell.
[15] Chandan Singh,et al. Interpretations are useful: penalizing explanations to align neural networks with prior knowledge , 2019, ICML.
[16] Luxin Zhang,et al. Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset , 2019, ArXiv.
[17] Vladimir Aliev,et al. Free-Lunch Saliency via Attention in Atari Agents , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[18] Peter Stone,et al. Leveraging Human Guidance for Deep Reinforcement Learning Tasks , 2019, IJCAI.
[19] Taghi M. Khoshgoftaar,et al. A survey on Image Data Augmentation for Deep Learning , 2019, Journal of Big Data.
[20] Michael L. Littman,et al. Deep Reinforcement Learning from Policy-Dependent Human Feedback , 2019, ArXiv.
[21] Kristian Kersting,et al. Explanatory Interactive Machine Learning , 2019, AIES.
[22] Marc G. Bellemare,et al. An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents , 2018, IJCAI.
[23] Yuta Tsuboi,et al. DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback , 2018, ArXiv.
[24] Luxin Zhang,et al. AGIL: Learning Attention from Human for Visuomotor Tasks , 2018, ECCV.
[25] Jonathan Dodge,et al. Visualizing and Understanding Atari Agents , 2017, ICML.
[26] Peter Stone,et al. Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces , 2017, AAAI.
[27] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[28] Marc Peter Deisenroth,et al. Deep Reinforcement Learning: A Brief Survey , 2017, IEEE Signal Processing Magazine.
[29] Shane Legg,et al. Deep Reinforcement Learning from Human Preferences , 2017, NIPS.
[30] Andrew Slavin Ross,et al. Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations , 2017, IJCAI.
[31] Karen M. Feigh,et al. Learning From Explanations Using Sentiment and Advice in RL , 2017, IEEE Transactions on Cognitive and Developmental Systems.
[32] Guan Wang,et al. Interactive Learning from Policy-Dependent Human Feedback , 2017, ICML.
[33] Johannes Fürnkranz,et al. A Survey of Preference-Based Reinforcement Learning Methods , 2017, J. Mach. Learn. Res.
[34] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[35] Andrea Lockerd Thomaz,et al. Policy Shaping with Human Teachers , 2015, IJCAI.
[36] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[37] David L. Roberts,et al. Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning , 2015, Autonomous Agents and Multi-Agent Systems.
[38] David L. Roberts,et al. Learning something from nothing: Leveraging implicit human feedback strategies , 2014, The 23rd IEEE International Symposium on Robot and Human Interactive Communication.
[39] Matthieu Geist,et al. Boosted Bellman Residual Minimization Handling Expert Demonstrations , 2014, ECML/PKDD.
[40] Andrea Lockerd Thomaz,et al. Policy Shaping: Integrating Human Feedback with Reinforcement Learning , 2013, NIPS.
[41] Peter Stone,et al. Reinforcement learning from simultaneous human and MDP reward , 2012, AAMAS.
[42] Peter Stone,et al. Combining manual feedback with subsequent MDP reward signals for reinforcement learning , 2010, AAMAS.
[43] Peter Stone,et al. Interactively shaping agents via human reinforcement: the TAMER framework , 2009, K-CAP '09.
[44] Andrea Lockerd Thomaz,et al. Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance , 2006, AAAI.
[45] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[46] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[47] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[48] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res.
[49] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[50] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[51] Stefan Schaal,et al. Learning from Demonstration , 1996, NIPS.