Hyun Oh Song | Seungyong Moon | Gaon An | Jang-Hyun Kim