暂无分享,去创建一个
Alex Graves | Koray Kavukcuoglu | Martin A. Riedmiller | David Silver | Daan Wierstra | Ioannis Antonoglou | Volodymyr Mnih | K. Kavukcuoglu | D. Silver | Volodymyr Mnih | A. Graves | Ioannis Antonoglou | Daan Wierstra | David Silver | Alex Graves
[1] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[2] C. Atkeson,et al. Prioritized Sweeping : Reinforcement Learning withLess Data and Less Real , 1993 .
[3] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[4] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[5] Jordan B. Pollack,et al. Why did TD-Gammon Work? , 1996, NIPS.
[6] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[7] Yoshua Bengio,et al. Convolutional networks for images, speech, and time series , 1998 .
[8] Alex M. Andrew,et al. Reinforcement Learning: : An Introduction , 1998 .
[9] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[10] Geoffrey E. Hinton,et al. Reinforcement Learning with Factored States and Actions , 2004, J. Mach. Learn. Res..
[11] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[12] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[13] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[14] Geoffrey E. Hinton. Learning multiple layers of representation , 2007, Trends in Cognitive Sciences.
[15] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[16] Shalabh Bhatnagar,et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation , 2009, NIPS.
[17] Martin A. Riedmiller,et al. Deep auto-encoder neural networks in reinforcement learning , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[18] Shalabh Bhatnagar,et al. Toward Off-Policy Learning Control with Function Approximation , 2010, ICML.
[19] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[20] Marc G. Bellemare,et al. Investigating Contingency Awareness Using Atari 2600 Games , 2012, AAAI.
[21] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[23] Yee Whye Teh,et al. Actor-Critic Reinforcement Learning with Energy-Based Policies , 2012, EWRL.
[24] Marc G. Bellemare,et al. Sketch-Based Linear Value Function Approximation , 2012, NIPS.
[25] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] Geoffrey E. Hinton,et al. Machine Learning for Aerial Image Labeling , 2013 .
[27] Marc G. Bellemare,et al. Bayesian Learning of Recursively Factored Environments , 2013, ICML.
[28] Yann LeCun,et al. Pedestrian Detection with Unsupervised Multi-stage Feature Learning , 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[29] Brian Kingsbury,et al. New types of deep neural network learning for speech recognition and related applications: an overview , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[30] Risto Miikkulainen,et al. A Neuroevolution Approach to General Atari Game Playing , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[31] Peter Kulchyski. and , 2015 .
[32] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..