A Constrained Multi-Objective Reinforcement Learning Framework
Sandy H. Huang | Yuval Tassa | Raia Hadsell | Steven Bohez | Martin A. Riedmiller | Abbas Abdolmaleki | Philemon Brakel | Nicolas Heess | Michael Neunert | Daniel J. Mankowitz | Giulia Vezzani
[1] Wojciech Matusik,et al. Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control , 2020, ICML.
[3] Craig Boutilier,et al. Data center cooling using model-predictive control , 2018, NeurIPS.
[4] D. Mankowitz,et al. An empirical investigation of the challenges of real-world reinforcement learning , 2020, ArXiv.
[5] Raia Hadsell,et al. Value constrained model-free continuous control , 2019, ArXiv.
[6] Yisong Yue,et al. Batch Policy Learning under Constraints , 2019, ICML.
[7] Yuriy Brun,et al. Preventing undesirable behavior of intelligent machines , 2019, Science.
[8] E. Altman. Constrained Markov Decision Processes , 1999.
[9] Mohammad Ghavamzadeh,et al. Lyapunov-based Safe Policy Optimization for Continuous Control , 2019, ArXiv.
[10] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[12] Dario Amodei,et al. Benchmarking Safe Exploration in Deep Reinforcement Learning , 2019.
[13] Yasemin Altun,et al. Relative Entropy Policy Search , 2010.
[14] Pieter Abbeel,et al. Responsive Safety in Reinforcement Learning by PID Lagrangian Methods , 2020, ICML.
[15] H. Francis Song,et al. A Distributional View on Multi-Objective Policy Optimization , 2020, ICML.
[16] Yiming Zhang,et al. First Order Optimization in Policy Space for Constrained Deep Reinforcement Learning , 2020, ArXiv.
[17] Ofir Nachum,et al. A Lyapunov-based Approach to Safe Reinforcement Learning , 2018, NeurIPS.
[18] J. Dennis,et al. A closer look at drawbacks of minimizing weighted sums of objectives for Pareto set generation in multicriteria optimization problems , 1997.
[19] Nicolas Le Roux,et al. An operator view of policy gradient methods , 2020, NeurIPS.
[20] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[21] Bernhard Sendhoff,et al. On Test Functions for Evolutionary Multi-objective Optimization , 2004, PPSN.
[22] Andreas Krause,et al. Safe Model-based Reinforcement Learning with Stability Guarantees , 2017, NIPS.
[23] Gabriel Dulac-Arnold,et al. Challenges of Real-World Reinforcement Learning , 2019, ArXiv.
[24] David Levine,et al. Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning , 2007, NIPS.
[25] Miroslav Dudík,et al. Reinforcement Learning with Convex Constraints , 2019, NeurIPS.
[26] Karthik Narasimhan,et al. Projection-Based Constrained Policy Optimization , 2020, ICLR.
[27] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[28] Shimon Whiteson,et al. A Survey of Multi-Objective Sequential Decision-Making , 2013, J. Artif. Intell. Res.
[29] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[30] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[31] Mladen Kolar,et al. Convergent Policy Optimization for Safe Reinforcement Learning , 2019, NeurIPS.
[32] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res.
[33] Sergio Gomez Colmenarejo,et al. Acme: A Research Framework for Distributed Reinforcement Learning , 2020, ArXiv.
[34] Alejandro Ribeiro,et al. Constrained Reinforcement Learning Has Zero Duality Gap , 2019, NeurIPS.
[35] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.
[36] K. Schittkowski,et al. NONLINEAR PROGRAMMING , 2022 .
[37] Qiuyi Zhang,et al. Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization , 2020, ICML.
[38] Junhyuk Oh,et al. Balancing Constraints and Rewards with Meta-Gradient D4PG , 2020, ICLR.
[39] Ann Nowé,et al. Scalarized multi-objective reinforcement learning: Novel design techniques , 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[40] Jan Peters,et al. Manifold-based multi-objective policy search with sample reuse , 2017, Neurocomputing.
[41] Marcello Restelli,et al. Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation , 2016, J. Artif. Intell. Res.
[42] Stuart J. Russell,et al. Q-Decomposition for Reinforcement Learning Agents , 2003, ICML.