Constrained Policy Optimization
Pieter Abbeel | David Held | Aviv Tamar | Joshua Achiam
[1] Moshe Haviv, et al. On constrained Markov decision processes, 1996, Oper. Res. Lett.
[2] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[3] E. Altman. Constrained Markov Decision Processes, 1999.
[4] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[5] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[6] E. Uchibe, et al. Constrained reinforcement learning from intrinsic and extrinsic rewards, 2007, 2007 IEEE 6th International Conference on Development and Learning.
[7] Stephen P. Boyd, et al. Subgradient Methods, 2007.
[8] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[9] Imre Csiszár, et al. Information Theory - Coding Theorems for Discrete Memoryless Systems, Second Edition, 2011.
[10] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.
[11] Daniele Calandriello, et al. Safe Policy Iteration, 2013, ICML.
[12] E. Lukács. Probability and Mathematical Statistics: An Introduction, 2014.
[13] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[14] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[15] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[16] Eric Eaton, et al. Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret, 2015, ICML.
[17] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[18] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[19] Nan Jiang, et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning, 2015, ICML.
[20] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[21] Amnon Shashua, et al. Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving, 2016, ArXiv.
[22] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[23] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[24] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[25] Pieter Abbeel, et al. Probabilistically safe policy transfer, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[26] Sergey Levine, et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic, 2016, ICLR.
[27] Zachary Chase Lipton, et al. Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, 2016, 1611.01211.
[28] Marco Pavone, et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria, 2015, J. Mach. Learn. Res.
[29] Sen Wang, et al. Deep Reinforcement Learning for Autonomous Driving, 2018, ArXiv.