Mohammad Ghavamzadeh | Aleksandra Faust | Ofir Nachum | Yinlam Chow | Edgar A. Duéñez-Guzmán
[1] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[2] Andreas Krause, et al. Reinforced Imitation: Sample Efficient Deep Reinforcement Learning for Mapless Navigation by Leveraging Prior Demonstrations, 2018, IEEE Robotics and Automation Letters.
[3] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[4] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[5] Andrew G. Barto, et al. Lyapunov Design for Safe Reinforcement Learning, 2003, J. Mach. Learn. Res.
[6] E. Altman. Constrained Markov Decision Processes, 1999.
[7] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[8] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[9] K. Schittkowski, et al. Nonlinear Programming, 2022.
[10] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[11] Yuval Tassa, et al. Safe Exploration in Continuous Action Spaces, 2018, ArXiv.
[12] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[13] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[14] Yaoliang Yu, et al. A General Projection Property for Distribution Families, 2009, NIPS.
[15] Trevor Darrell, et al. End-to-end training of deep visuomotor policies, 2016.
[16] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[17] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[18] Andreas Krause, et al. Safe Model-based Reinforcement Learning with Stability Guarantees, 2017, NIPS.
[19] Eitan Altman, et al. Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program, 1998, Math. Methods Oper. Res.
[20] Fritz Wysotzki, et al. Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, 2005, J. Artif. Intell. Res.
[21] Lydia Tapia, et al. Continuous action reinforcement learning for control-affine systems with unknown dynamics, 2014, IEEE/CAA Journal of Automatica Sinica.
[22] Marco Pavone, et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria, 2015, J. Mach. Learn. Res.
[23] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[24] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[25] Lydia Tapia, et al. PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-Based Planning, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[26] Aleksandra Faust, et al. Learning Navigation Behaviors End-to-End With AutoRL, 2018, IEEE Robotics and Automation Letters.
[27] J. Zico Kolter, et al. OptNet: Differentiable Optimization as a Layer in Neural Networks, 2017, ICML.
[28] James Davidson, et al. TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow, 2017, ArXiv.
[29] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[30] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[31] Ofir Nachum, et al. A Lyapunov-based Approach to Safe Reinforcement Learning, 2018, NeurIPS.