From Adversarial Imitation Learning to Robust Batch Imitation Learning
暂无分享,去创建一个
[1] Anca D. Dragan,et al. On the Utility of Learning about Humans for Human-AI Coordination , 2019, NeurIPS.
[2] Pieter Abbeel,et al. An Algorithmic Perspective on Imitation Learning , 2018, Found. Trends Robotics.
[3] Deepak Pathak,et al. Self-Supervised Exploration via Disagreement , 2019, ICML.
[4] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[5] Ronald Kemker,et al. Measuring Catastrophic Forgetting in Neural Networks , 2017, AAAI.
[6] Dale Schuurmans,et al. Striving for Simplicity in Off-policy Deep Reinforcement Learning , 2019, ArXiv.
[7] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[8] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[9] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[10] R Bellman,et al. On the Theory of Dynamic Programming. , 1952, Proceedings of the National Academy of Sciences of the United States of America.
[11] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[12] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[13] Christos Dimitrakakis,et al. Multi-View Decision Processes: The Helper-AI Problem , 2017, NIPS.
[14] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[15] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res..
[16] Byron Boots,et al. Agile Autonomous Driving using End-to-End Deep Imitation Learning , 2017, Robotics: Science and Systems.
[17] Tetsuya Yohira,et al. Sample Efficient Imitation Learning for Continuous Control , 2018, ICLR.
[18] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[19] Siddhartha S. Srinivasa,et al. Effects of Robot Motion on Human-Robot Collaboration , 2015, 2015 10th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[20] Neil D. Lawrence,et al. Dataset Shift in Machine Learning , 2009 .
[21] Anca D. Dragan,et al. DART: Noise Injection for Robust Imitation Learning , 2017, CoRL.
[22] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[23] Nan Jiang,et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning , 2015, ICML.
[24] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[25] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[26] Kyunghyun Cho,et al. Query-Efficient Imitation Learning for End-to-End Autonomous Driving , 2016, ArXiv.
[27] Martin A. Riedmiller,et al. Batch Reinforcement Learning , 2012, Reinforcement Learning.
[28] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[29] Mikael Henaff,et al. Disagreement-Regularized Imitation Learning , 2020, ICLR.
[30] Marvin Minsky,et al. Steps toward Artificial Intelligence , 1995, Proceedings of the IRE.
[31] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[32] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[33] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[34] Deepak Pathak,et al. Learning to Generalize via Self-Supervised Prediction , 2019 .
[35] Balaraman Ravindran,et al. RAIL: Risk-Averse Imitation Learning , 2018, AAMAS.
[36] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[37] Germán Ros,et al. CARLA: An Open Urban Driving Simulator , 2017, CoRL.
[38] Charles A. Sutton,et al. VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning , 2017, NIPS.
[39] Marcin Andrychowicz,et al. One-Shot Imitation Learning , 2017, NIPS.
[40] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[41] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[42] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[43] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[44] Ilya Kostrikov,et al. Imitation Learning via Off-Policy Distribution Matching , 2019, ICLR.
[45] Adam Lerer,et al. "Other-Play" for Zero-Shot Coordination , 2020, ICML.
[46] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[47] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[48] Martin Buss,et al. Human-Robot Collaboration: a Survey , 2008, Int. J. Humanoid Robotics.
[49] Kee-Eung Kim,et al. A Bayesian Approach to Generative Adversarial Imitation Learning , 2018, NeurIPS.
[50] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[51] Stefano Ermon,et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations , 2017, NIPS.
[52] Alexandros Kalousis,et al. Sample-Efficient Imitation Learning via Generative Adversarial Nets , 2018, AISTATS.
[53] Aran Nayebi,et al. CARMA : A Deep Reinforcement Learning Approach to Autonomous Driving , 2016 .
[54] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[55] Sergey Levine,et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction , 2019, NeurIPS.
[56] Sergey Levine,et al. One-Shot Visual Imitation Learning via Meta-Learning , 2017, CoRL.
[57] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[58] Larry Rudolph,et al. Implementation Matters in Deep RL: A Case Study on PPO and TRPO , 2020, ICLR.
[59] Sergey Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[60] Larry Rudolph,et al. A Closer Look at Deep Policy Gradients , 2018, ICLR.
[61] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[62] Yang Liu,et al. Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening , 2016, ICLR.
[63] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[64] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[65] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[66] Tanmay Gangwani,et al. State-only Imitation with Transition Dynamics Mismatch , 2020, ICLR.
[67] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[68] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[69] Sorin Grigorescu,et al. A Survey of Deep Learning Techniques for Autonomous Driving , 2020, J. Field Robotics.
[70] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 2015 .
[71] Jitendra Malik,et al. Zero-Shot Visual Imitation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[72] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[73] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[74] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[75] Eder Santana,et al. Exploring the Limitations of Behavior Cloning for Autonomous Driving , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[76] J. Andrew Bagnell,et al. Efficient Reductions for Imitation Learning , 2010, AISTATS.
[77] Dean Pomerleau,et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation , 1991, Neural Computation.
[78] Anca D. Dragan,et al. Information gathering actions over human internal state , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).