Andrew G. Barto,et al. Monte Carlo Matrix Inversion and Reinforcement Learning , 1993, NIPS.
 Terrence J. Sejnowski,et al. Temporal Difference Learning of Position Evaluation in the Game of Go , 1993, NIPS.
 Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
 Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
 Wei Zhang,et al. A Reinforcement Learning Approach to job-shop Scheduling , 1995, IJCAI.
 Gerald Tesauro,et al. On-line Policy Improvement using Monte-Carlo Search , 1996, NIPS.
 Yoshua Bengio,et al. Convolutional networks for images, speech, and time series , 1998 .
 Michael Buro,et al. From Simple Features to Sophisticated Evaluation Functions , 1998, Computers and Games.
 Jonathan Schaeffer,et al. Temporal Difference Learning Applied to a High-Performance Game-Playing Program , 2001, IJCAI.
 Haixun Wang,et al. Empirical comparison of various reinforcement learning strategies for sequential targeted marketing , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..
 Markus Enzenberger. Evaluation in Go by a Neural Network using Soft Segmentation , 2003, ACG.
 Michail G. Lagoudakis,et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers , 2003, ICML.
 Richard S. Sutton,et al. Reinforcement learning with replacing eligibility traces , 2004, Machine Learning.
 Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
 Richard S. Sutton,et al. Learning to Predict by the Methods of Temporal Differences , 1988, Machine Learning.
 Olivier Teytaud,et al. Modiﬁcation of UCT with Patterns in Monte-Carlo Go , 2006 .
 Rémi Coulom,et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search , 2006, Computers and Games.
 Rémi Coulom. Computing "Elo Ratings" of Move Patterns in the Game of Go , 2007, J. Int. Comput. Games Assoc..
 Rémi Coulom,et al. Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength , 2008, Computers and Games.
 David Silver,et al. Reinforcement learning and simulation-based search in computer go , 2009 .
 N. Le Fort-Piat,et al. The world of independent learners is not markovian , 2011, Int. J. Knowl. Based Intell. Eng. Syst..
 David Silver,et al. Monte-Carlo tree search and rapid action value estimation in computer Go , 2011, Artif. Intell..
 Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
 David Silver,et al. Concurrent Reinforcement Learning from Customer Interactions , 2013, ICML.
 Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
 Bruno Scherrer. Approximate Policy Iteration Schemes: A Comparison , 2014, ICML.
 David Silver,et al. Move Evaluation in Go Using Deep Convolutional Neural Networks , 2014, ICLR.
 Matthew Lai. Giraffe: Using Deep Reinforcement Learning to Play Chess , 2015, ArXiv.
 Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
 Matthieu Geist,et al. Approximate modified policy iteration and its application to the game of Tetris , 2015, J. Mach. Learn. Res..
 Amos J. Storkey,et al. Training Deep Convolutional Neural Networks to Play Go , 2014, ICML.
 Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
 Yuandong Tian,et al. Better Computer Go Player with Neural Network and Long-term Prediction , 2016, ICLR.
 Nando de Freitas,et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.
 Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
 Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
 David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
 David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
 Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
 Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
 Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2017, ICLR.