Quantile Credit Assignment
Georg Ostrovski | R. Munos | Mark Rowland | Will Dabney | T. Weber | É. Moulines | Clare Lyle | A. Gruslys | Alaa Saade | T. Mesnard | Yunhao Tang | M. Valko | Wenqi Chen
[1] Georg Ostrovski,et al. Distributional Reinforcement Learning , 2023 .
[2] Matthew W. Hoffman,et al. Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach , 2022, ArXiv.
[3] Daniel Wontae Nam,et al. GMAC: A Distributional Perspective on Actor-Critic Framework , 2021, ICML.
[4] Marcus Hutter,et al. Counterfactual Credit Assignment in Model-Free Reinforcement Learning , 2020, ICML.
[5] Sergio Gomez Colmenarejo,et al. Acme: A Research Framework for Distributed Reinforcement Learning , 2020, ArXiv.
[6] Yang Guan,et al. Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors , 2020, IEEE Transactions on Neural Networks and Learning Systems.
[7] Doina Precup,et al. Hindsight Credit Assignment , 2019, NeurIPS.
[8] Yan Wu,et al. Optimizing agent behavior over long time scales by transporting value , 2018, Nature Communications.
[9] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[10] Alexandre M. Bayen,et al. Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines , 2018, ICLR.
[11] David Duvenaud,et al. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation , 2017, ICLR.
[12] Dengyong Zhou,et al. Action-dependent Control Variates for Policy Optimization via Stein's Identity , 2017 .
[13] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[14] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[15] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[16] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[17] Sergey Levine,et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic , 2016, ICLR.
[18] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[19] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[20] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[21] Marvin Minsky,et al. Steps toward Artificial Intelligence , 1961, Proceedings of the IRE.
[22] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[23] P. Poupart,et al. Distributional Reinforcement Learning with Monotonic Splines , 2022, ICLR.
[24] Xingdong Feng,et al. Non-Crossing Quantile Regression for Distributional Reinforcement Learning , 2020, NeurIPS.
[25] R. Koenker,et al. Regression Quantiles , 1978 .
[26] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.