Action Candidate Based Clipped Double Q-learning for Discrete and Continuous Action Tasks
Jian Yang | Jin Xie | Haobo Jiang
[1] Hado van Hasselt,et al. Estimating the Maximum Expected Value: An Analysis of (Nested) Cross Validation and the Maximum Sample Average, 2013, ArXiv.
[2] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[3] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[4] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[5] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[6] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[7] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[8] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[9] Sebastian Thrun,et al. Issues in Using Function Approximation for Reinforcement Learning, 1999.
[10] Tian Tian,et al. MinAtar: An Atari-inspired Testbed for More Efficient Reinforcement Learning Experiments , 2019, ArXiv.
[11] Lihong Li,et al. Reinforcement Learning in Finite MDPs: PAC Analysis, 2009, J. Mach. Learn. Res.
[12] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[13] Lihong Li,et al. PAC model-free reinforcement learning , 2006, ICML.
[14] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[15] Mykel J. Kochenderfer,et al. Weighted Double Q-learning , 2017, IJCAI.
[16] Marcello Restelli,et al. Estimating Maximum Expected Value through Gaussian Approximation , 2016, ICML.
[17] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[18] Lawrence Carin,et al. Revisiting the Softmax Bellman Operator: New Benefits and New Perspective , 2018, ICML.
[19] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[20] András Lörincz,et al. The many faces of optimism: a unifying approach , 2008, ICML '08.