Justinian Rosca | Peter J. Ramadge | Tsung-Yen Yang | Karthik Narasimhan
[1] Carlos V. Regueiro,et al. Learning on real robots from experience and simple user feedback, 2013.
[2] Saso Dzeroski,et al. Integrating Guidance into Relational Reinforcement Learning , 2004, Machine Learning.
[3] Karthik Narasimhan,et al. Projection-Based Constrained Policy Optimization , 2020, ICLR.
[4] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[5] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res.
[6] Pieter Abbeel,et al. Responsive Safety in Reinforcement Learning by PID Lagrangian Methods , 2020, ICML.
[7] Natasha Jaques,et al. Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog , 2019, ArXiv.
[8] Dorsa Sadigh,et al. When Humans Aren’t Optimal: Robots that Collaborate with Risk-Aware Humans , 2020, 2020 15th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[9] Aryan Mokhtari,et al. Escaping Saddle Points in Constrained Optimization , 2018, NeurIPS.
[10] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.
[11] Alexandre M. Bayen,et al. Benchmarks for reinforcement learning in mixed-autonomy traffic , 2018, CoRL.
[12] Javier García,et al. Safe Exploration of State and Action Spaces in Reinforcement Learning , 2012, J. Artif. Intell. Res.
[13] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[14] Yisong Yue,et al. Batch Policy Learning under Constraints , 2019, ICML.
[15] Oliver Kroemer,et al. Learning to select and generalize striking movements in robot table tennis , 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[16] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[17] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.
[18] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[19] Peter Stone,et al. Leveraging Human Guidance for Deep Reinforcement Learning Tasks , 2019, IJCAI.
[20] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[21] Masashi Sugiyama,et al. Imitation Learning from Imperfect Demonstration , 2019, ICML.
[22] Leslie Pack Kaelbling,et al. Practical Reinforcement Learning in Continuous Spaces , 2000, ICML.
[23] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[24] Marc Teboulle,et al. Convergence Analysis of a Proximal-Like Minimization Algorithm Using Bregman Functions , 1993, SIAM J. Optim.
[25] Andrea Lockerd Thomaz,et al. Robot Learning from Human Teachers, 2014, Robot Learning from Human Teachers.
[26] Byron Boots,et al. Dual Policy Iteration , 2018, NeurIPS.
[27] Shimon Whiteson,et al. Neuroevolutionary reinforcement learning for generalized control of simulated helicopters , 2011, Evol. Intell.
[28] Brijen Thananjeyan,et al. On-Policy Robot Imitation Learning from a Converging Supervisor , 2019, CoRL.
[29] E. Altman. Constrained Markov Decision Processes, 1999.
[30] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[31] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[32] Martin A. Riedmiller,et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.
[33] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[34] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[35] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[36] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[37] Mohammad Ghavamzadeh,et al. Lyapunov-based Safe Policy Optimization for Continuous Control , 2019, ArXiv.
[38] Yuval Tassa,et al. Safe Exploration in Continuous Action Spaces , 2018, ArXiv.
[39] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res.
[40] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[41] John Salvatier,et al. Agent-Agnostic Human-in-the-Loop Reinforcement Learning , 2017, ArXiv.
[42] Joelle Pineau,et al. Benchmarking Batch Deep Reinforcement Learning Algorithms , 2019, ArXiv.
[43] Prabhat Nagarajan,et al. Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations , 2019, ICML.
[44] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[45] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.