Guangwen Yang | Tianqi Zhao | Lintao Zhang | Zichuan Lin
[1] W. B. Johnson,et al. Extensions of Lipschitz mappings into Hilbert space , 1984 .
[2] Kilian Q. Weinberger,et al. Proceedings of the 33rd International Conference on Machine Learning - Volume 48 , 2016 .
[3] Philippe Preux,et al. Recent Advances in Reinforcement Learning , 2008, Lecture Notes in Computer Science.
[4] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[5] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[6] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[7] Smruti Amarjyoti. Deep Reinforcement Learning for Robotic Manipulation - The state of the art , 2017, ArXiv.
[8] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[9] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[10] Sebastian Thrun,et al. Issues in Using Function Approximation for Reinforcement Learning , 1999 .
[11] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[12] Marc G. Bellemare,et al. Q($\lambda$) with Off-Policy Corrections , 2016 .
[13] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[14] Jon Louis Bentley,et al. Multidimensional binary search trees used for associative searching , 1975, CACM.
[15] Richard S. Sutton,et al. Dyna, an integrated architecture for learning, planning, and reacting , 1990, SGAR.
[16] Demis Hassabis,et al. Neural Episodic Control , 2017, ICML.
[17] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[18] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[19] Joel Z. Leibo,et al. Model-Free Episodic Control , 2016, ArXiv.
[20] J. M. Boardman,et al. Contemporary Mathematics , 2007 .
[21] Richard S. Sutton,et al. Reinforcement learning with replacing eligibility traces , 2004, Machine Learning.
[22] Yang Liu,et al. Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening , 2016, ICLR.
[23] Michael Bowling,et al. The arcade learning environment , 2013 .
[24] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[25] N. Daw,et al. Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework , 2017, Annual review of psychology.
[26] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[27] T. Robbins,et al. The hippocampal–striatal axis in learning, prediction and goal-directed behavior , 2011, Trends in Neurosciences.
[28] Daniel Gooch,et al. Communications of the ACM , 2011, XRDS.
[29] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[30] Peter Dayan,et al. Hippocampal Contributions to Control: The Third Way , 2007, NIPS.