Zhenghao Peng | Bolei Zhou | Ziping Xu | Jiadong Guo | Hao Sun | Bo Dai | Meng Fang