Marcin Andrychowicz | Jakub Swiatkowski | Piotr Januszewski | Mateusz Olko | Michal Królikowski | Lukasz Kucinski | Piotr Milos
[1] P. Abbeel,et al. SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning , 2020, ICML.
[2] Marcin Andrychowicz,et al. What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study , 2020, ArXiv.
[3] Sergey Levine,et al. DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction , 2020, NeurIPS.
[4] Anoop Korattikara Balan,et al. Measuring the Reliability of Reinforcement Learning Algorithms , 2019, ICLR.
[5] K. Ross,et al. Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling , 2019, ICML.
[6] Larry Rudolph,et al. A Closer Look at Deep Policy Gradients , 2018, ICLR.
[7] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res.
[8] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[9] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[10] K. Czechowski,et al. Uncertainty-sensitive Learning and Planning with Ensembles , 2019, arXiv.org.
[11] Balaji Lakshminarayanan,et al. Deep Ensembles: A Loss Landscape Perspective , 2019, ArXiv.
[12] Sham M. Kakade,et al. Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control , 2018, ICLR.
[13] Zheng Wen,et al. Deep Exploration via Randomized Value Functions , 2017, J. Mach. Learn. Res.
[14] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[15] Chun Yuan,et al. Self-Adaptive Double Bootstrapped DDPG , 2018, IJCAI.
[16] Albin Cassirer,et al. Randomized Prior Functions for Deep Reinforcement Learning , 2018, NeurIPS.
[17] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[18] Satinder Singh,et al. On Learning Intrinsic Rewards for Policy Gradient Methods , 2018, NeurIPS.
[19] Lei Cao,et al. Ensemble Network Architecture for Deep Reinforcement Learning , 2018 .
[20] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[21] Pieter Abbeel,et al. Model-Ensemble Trust-Region Policy Optimization , 2018, ICLR.
[22] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[23] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[24] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[25] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[26] Shuchang Zhou,et al. Learning to Run with Actor-Critic Ensemble , 2017, ArXiv.
[27] Charles Blundell,et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.
[28] Benjamin Van Roy,et al. Why is Posterior Sampling Better than Optimism for Reinforcement Learning? , 2016, ICML.
[29] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[30] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[31] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[32] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[33] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[34] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[35] Benjamin Van Roy,et al. (More) Efficient Reinforcement Learning via Posterior Sampling , 2013, NIPS.
[36] Marco Wiering,et al. Ensemble Algorithms in Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[37] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[38] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .