Dario Pavllo | Jonas Köhler | Dimitris Gkouletsos | Ziyad Sheebaelhamd | Konstantinos Zisis | Athina Nisioti
[1] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[2] Ufuk Topcu, et al. Safe Reinforcement Learning via Shielding, 2017, AAAI.
[3] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[4] Donald Goldfarb, et al. A numerically stable dual method for solving strictly convex quadratic programs, 1983, Math. Program.
[5] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[6] Vijay Kumar, et al. Learning Safe Unlabeled Multi-Robot Planning with Motion Constraints, 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[7] Kim Peter Wabersich, et al. Linear Model Predictive Safety Certification for Learning-Based Control, 2018, 2018 IEEE Conference on Decision and Control (CDC).
[8] Barry Lennox, et al. Voronoi-Based Multi-Robot Autonomous Exploration in Unknown Environments via Deep Reinforcement Learning, 2020, IEEE Transactions on Vehicular Technology.
[9] Yuval Tassa, et al. Safe Exploration in Continuous Action Spaces, 2018, ArXiv.
[10] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[11] Mohammad Ghavamzadeh, et al. Lyapunov-based Safe Policy Optimization for Continuous Control, 2019, ArXiv.
[12] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[13] Osbert Bastani, et al. MAMPS: Safe Multi-Agent Reinforcement Learning via Model Predictive Shielding, 2019, ArXiv.
[14] Etienne Perot, et al. Deep Reinforcement Learning framework for Autonomous Driving, 2017, Autonomous Vehicles and Machines.
[15] J. Maciejowski, et al. Soft constraints and exact penalty functions in model predictive control, 2000.
[16] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[17] Jianfeng Gao, et al. Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, 2016, ArXiv.
[18] Eitan Altman, et al. Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program, 1998, Math. Methods Oper. Res.
[19] Craig Boutilier, et al. Data center cooling using model-predictive control, 2018, NeurIPS.
[20] Pieter Abbeel, et al. Emergence of Grounded Compositional Language in Multi-Agent Populations, 2017, AAAI.