Safe Driving via Expert Guided Policy Optimization
Bolei Zhou | Chunxiao Liu | Quanyi Li | Zhenghao Peng
[1] Quanyi Li, et al. Improving the Generalization of End-to-End Driving through Procedural Generation, 2020, ArXiv.
[2] Pieter Abbeel, et al. Responsive Safety in Reinforcement Learning by PID Lagrangian Methods, 2020, ICML.
[3] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[4] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[5] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[6] Anca D. Dragan, et al. SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards, 2019, ICLR.
[7] Bolei Zhou, et al. Learning a Decision Module by Imitating Driver's Control Behaviors, 2019, CoRL.
[8] Shane Legg, et al. Reward learning from human preferences and demonstrations in Atari, 2018, NeurIPS.
[9] Kyunghyun Cho, et al. Query-Efficient Imitation Learning for End-to-End Autonomous Driving, 2016, ArXiv.
[10] Ofir Nachum, et al. A Lyapunov-based Approach to Safe Reinforcement Learning, 2018, NeurIPS.
[11] Yifan Wu, et al. Behavior Regularized Offline Reinforcement Learning, 2019, ArXiv.
[12] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[13] Mark R. Mine, et al. The Panda3D Graphics Engine, 2004, Computer.
[14] Harshit Sikchi, et al. Lyapunov Barrier Policy Optimization, 2021, ArXiv.
[15] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[16] S. Levine, et al. Conservative Q-Learning for Offline Reinforcement Learning, 2020, NeurIPS.
[17] John Salvatier, et al. Agent-Agnostic Human-in-the-Loop Reinforcement Learning, 2017, ArXiv.
[18] Florian Richter, et al. Open-Sourced Reinforcement Learning Environments for Surgical Robotics, 2019, ArXiv.
[19] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[20] Sergey Levine, et al. Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning, 2019, ArXiv.
[21] Michael I. Jordan, et al. RLlib: Abstractions for Distributed Reinforcement Learning, 2017, ICML.
[22] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[23] Santiago Grijalva, et al. A Review of Reinforcement Learning for Autonomous Building Energy Management, 2019, Comput. Electr. Eng.
[24] Junhyuk Oh, et al. Balancing Constraints and Rewards with Meta-Gradient D4PG, 2021, ICLR.
[25] Bernard Widrow, et al. Pattern Recognition and Adaptive Control, 1964, IEEE Transactions on Applications and Industry.
[26] Katherine Rose Driggs-Campbell, et al. HG-DAgger: Interactive Imitation Learning with Human Experts, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[27] Bolei Zhou, et al. Neuro-Symbolic Program Search for Autonomous Driving Decision Module Design, 2020, CoRL.
[28] Sehoon Ha, et al. Learning to be Safe: Deep RL with a Safety Critic, 2020, ArXiv.
[29] David Janz, et al. Learning to Drive in a Day, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[30] Yisong Yue, et al. Learning for Safety-Critical Control with Control Barrier Functions, 2019, L4DC.
[31] Bolei Zhou, et al. MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning, 2021, ArXiv.
[32] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[33] Dario Amodei, et al. Benchmarking Safe Exploration in Deep Reinforcement Learning, 2019.
[34] Chelsea Finn, et al. Cautious Adaptation for Reinforcement Learning in Safety-Critical Settings, 2020, ICML.
[35] Gábor Orosz, et al. End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks, 2019, AAAI.
[36] Sergey Levine, et al. Learning to Walk in the Real World with Minimal Human Effort, 2020, CoRL.
[37] Owain Evans, et al. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, 2017, AAMAS.
[38] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[39] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[40] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[41] Zoran Popovic, et al. Where to Add Actions in Human-in-the-Loop Reinforcement Learning, 2017, AAAI.
[42] Yuval Tassa, et al. Safe Exploration in Continuous Action Spaces, 2018, ArXiv.
[43] Andrew G. Barto, et al. Lyapunov Design for Safe Reinforcement Learning, 2003, J. Mach. Learn. Res.
[44] S. Kambhampati, et al. Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning, 2020, ArXiv.
[45] Yongshuai Liu, et al. IPO: Interior-point Policy Optimization under Constraints, 2019, AAAI.
[46] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[47] Lantao Yu, et al. Adversarial Inverse Reinforcement Learning With Self-Attention Dynamics Model, 2021, IEEE Robotics and Automation Letters.
[48] Martial Hebert, et al. Learning monocular reactive UAV control in cluttered natural environments, 2012, 2013 IEEE International Conference on Robotics and Automation.