Learning to Weight Imperfect Demonstrations
Yunke Wang | Chang Xu | Bo Du | Honglak Lee
[1] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[2] Stuart J. Russell, et al. Inverse reinforcement learning for video games, 2018, ArXiv.
[3] Bin Yang, et al. Learning to Reweight Examples for Robust Deep Learning, 2018, ICML.
[4] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[5] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[6] Jason Weston, et al. Curriculum learning, 2009, ICML.
[7] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[8] Sergey Levine, et al. Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow, 2018, ICLR.
[9] Claude Sammut, et al. A Framework for Behavioural Cloning, 1995, Machine Intelligence 15.
[10] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.
[11] Daphne Koller, et al. Self-Paced Learning for Latent Variable Models, 2010, NIPS.
[12] Johannes Fürnkranz, et al. Model-Free Preference-Based Reinforcement Learning, 2016, AAAI.
[13] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[14] Alexey Dosovitskiy, et al. End-to-End Driving Via Conditional Imitation Learning, 2018, IEEE International Conference on Robotics and Automation (ICRA).
[15] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[16] Pieter Abbeel, et al. Third-Person Imitation Learning, 2017, ICLR.
[17] Stefano Ermon, et al. Learning Large-Scale Dynamic Discrete Choice Models of Spatio-Temporal Preferences with Application to Migratory Pastoralism in East Africa, 2015, AAAI.
[18] Masashi Sugiyama, et al. Imitation Learning from Imperfect Demonstration, 2019, ICML.
[19] Jürgen Schmidhuber, et al. A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots, 2016, IEEE Robotics and Automation Letters.
[20] Hao Li, et al. Visualizing the Loss Landscape of Neural Nets, 2017, NeurIPS.
[21] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[22] Stefano Ermon, et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, 2017, NIPS.
[23] Masashi Sugiyama, et al. Rethinking Importance Weighting for Deep Learning under Distribution Shift, 2020, NeurIPS.
[24] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[25] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[26] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[27] Scott Niekum, et al. Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations, 2019, CoRL.
[28] Huang Xiao, et al. Wasserstein Adversarial Imitation Learning, 2019, ArXiv.
[29] Siyuan Liu, et al. Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise, 2014, AAAI.
[30] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[31] Peter Englert, et al. Model-based imitation learning by probabilistic trajectory matching, 2013, IEEE International Conference on Robotics and Automation (ICRA).
[32] Siddhartha Srinivasa, et al. Imitation Learning as f-Divergence Minimization, 2019, WAFR.
[33] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[34] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[35] Prabhat Nagarajan, et al. Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations, 2019, ICML.
[36] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[37] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[38] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[39] Mohammad Emtiyaz Khan, et al. VILD: Variational Imitation Learning with Diverse-quality Demonstrations, 2019, ICML.
[40] Hiroaki Sugiyama, et al. Preference-learning based Inverse Reinforcement Learning for Dialog Control, 2012, INTERSPEECH.
[41] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[42] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.