Safe Policy Learning for Continuous Control
Aleksandra Faust | Ofir Nachum | Yinlam Chow | Mohammad Ghavamzadeh | Edgar A. Duéñez-Guzmán
[1] Karthik Narasimhan,et al. Projection-Based Constrained Policy Optimization , 2020, ICLR.
[2] Aleksandra Faust,et al. Learning Navigation Behaviors End-to-End With AutoRL , 2018, IEEE Robotics and Automation Letters.
[3] Sergey Levine,et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.
[4] Ofir Nachum,et al. A Lyapunov-based Approach to Safe Reinforcement Learning , 2018, NeurIPS.
[5] Yuval Tassa,et al. Safe Exploration in Continuous Action Spaces , 2018, ArXiv.
[6] Lydia Tapia,et al. PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-Based Planning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[7] James Davidson,et al. TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow , 2017, ArXiv.
[8] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[9] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[10] Andreas Krause,et al. Safe Model-based Reinforcement Learning with Stability Guarantees , 2017, NIPS.
[11] J. Zico Kolter,et al. OptNet: Differentiable Optimization as a Layer in Neural Networks , 2017, ICML.
[12] Marco Pavone,et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria , 2015, J. Mach. Learn. Res..
[13] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[14] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[15] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[16] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[17] Trevor Darrell,et al. End-to-end training of deep visuomotor policies , 2016 .
[18] Daniel King,et al. Fetch & Freight: Standard Platforms for Service Robot Applications , 2016 .
[19] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[20] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[21] Lydia Tapia,et al. Continuous action reinforcement learning for control-affine systems with unknown dynamics , 2014, IEEE/CAA Journal of Automatica Sinica.
[22] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[23] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[24] Fritz Wysotzki,et al. Risk-Sensitive Reinforcement Learning Applied to Control under Constraints , 2005, J. Artif. Intell. Res..
[25] Andrew G. Barto,et al. Lyapunov Design for Safe Reinforcement Learning , 2003, J. Mach. Learn. Res..
[26] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[27] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[28] E. Altman. Constrained Markov Decision Processes , 1999 .
[29] Eitan Altman,et al. Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program , 1998, Math. Methods Oper. Res..
[30] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[31] K. Schittkowski,et al. Nonlinear Programming , 2022 .