Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms
University of Groningen
Sabatelli,
[1] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[2] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[3] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[4] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[5] Wenhao Yu,et al. Supplementary material , 2015 .
[6] Marco Wiering,et al. The QV family compared to other reinforcement learning algorithms , 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[7] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[8] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[9] Pieter Abbeel,et al. Towards Characterizing Divergence in Deep Q-Learning , 2019, ArXiv.
[10] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[11] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[12] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[13] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[14] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[15] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[16] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[17] Matteo Hessel,et al. Deep Reinforcement Learning and the Deadly Triad , 2018, ArXiv.
[18] Gilles Louppe,et al. Deep Quality-Value (DQV) Learning , 2019, BNAIC/BENELEARN.
[19] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[20] Sergey Levine,et al. Temporal Difference Models: Model-Free Deep RL for Model-Based Control , 2018, ICLR.
[21] Marco Wiering. QV(lambda)-learning: A New On-policy Reinforcement Learning Algorithm , 2005 .
[22] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[23] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[24] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[25] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[26] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.