Human-level control through deep reinforcement learning
Volodymyr Mnih | Koray Kavukcuoglu | David Silver | Andrei A. Rusu | Joel Veness | Marc G. Bellemare | Alex Graves | Martin Riedmiller | Andreas K. Fidjeland | Georg Ostrovski | Stig Petersen | Charles Beattie | Amir Sadik | Ioannis Antonoglou | Helen King | Dharshan Kumaran | Daan Wierstra | Shane Legg | Demis Hassabis
[1] D. Hubel, et al. Shape and arrangement of columns in cat's striate cortex, 1963, The Journal of Physiology.
[2] James L. McClelland, et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations, 1986.
[3] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[4] James L. McClelland, et al. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory, 1995, Psychological Review.
[5] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[6] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS 1996.
[7] Peter Dayan, et al. A Neural Substrate of Prediction and Reward, 1997, Science.
[8] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[9] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[10] N. Sigala, et al. Visual categorization shapes feature selectivity in the primate temporal cortex, 2002, Nature.
[11] Kunihiko Fukushima, et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980, Biological Cybernetics.
[12] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[13] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[14] Thomas Serre, et al. Object recognition with features inspired by visual cortex, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[15] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[16] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[17] Michael R. Genesereth, et al. General Game Playing: Overview of the AAAI Competition, 2005, AI Mag.
[18] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[19] Shane Legg, et al. Universal Intelligence: A Definition of Machine Intelligence, 2007, Minds and Machines.
[20] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.
[21] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[22] Andre Cohen, et al. An object-oriented representation for efficient reinforcement learning, 2008, ICML '08.
[23] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, 2009 IEEE 12th International Conference on Computer Vision.
[24] E. Thorndike. Animal Intelligence; Experimental Studies, 2009.
[25] C. Law, et al. Reinforcement learning can account for associative and perceptual learning on a visual decision task, 2009, Nature Neuroscience.
[26] Martin A. Riedmiller, et al. Reinforcement learning for robot soccer, 2009, Auton. Robots.
[27] Martin A. Riedmiller, et al. Deep auto-encoder neural networks in reinforcement learning, 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[28] J. O’Neill, et al. Play it again: reactivation of waking experience and memory, 2010, Trends in Neurosciences.
[29] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[30] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games, 2012, AAAI.
[31] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[32] Daniel Bendor, et al. Biasing the content of hippocampal replay during sleep, 2012, Nature Neuroscience.
[33] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.