Learning to Play: Reinforcement Learning and Games
[1] H. P.,et al. Mathematical Recreations , 1944, Nature.
[2] E. Rowland. Theory of Games and Economic Behavior , 1946, Nature.
[3] Claude E. Shannon,et al. Programming a computer for playing chess , 1950 .
[4] C. S. Strachey,et al. Logical or non-mathematical programmes , 1952, ACM '52.
[5] Allen Newell,et al. Elements of a theory of human problem solving. , 1958 .
[6] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain , 1958, Psychological review.
[7] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[8] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[9] D. Hubel,et al. Shape and arrangement of columns in cat's striate cortex , 1963, The Journal of physiology.
[10] Daniel Edwards,et al. The Alpha-Beta Heuristic , 1963 .
[11] R. Bellman. On the Application of Dynamic Programming to the Determination of Optimal Play in Chess and Checkers , 1965, Proceedings of the National Academy of Sciences of the United States of America.
[12] D. Michie. Game-Playing and Game-Learning Automata , 1966.
[13] Joseph Weizenbaum,et al. ELIZA—a computer program for the study of natural language communication between man and machine , 1966, CACM.
[14] Donald E. Eastlake,et al. The Greenblatt chess program , 1967, AFIPS '67 (Fall).
[15] D. Hubel,et al. Receptive fields and functional architecture of monkey striate cortex , 1968, The Journal of physiology.
[16] Barbara J Huberman,et al. A program to play chess end games , 1968 .
[17] Morton D. Davis. Game Theory: A Nontechnical Introduction , 1970 .
[18] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[19] Richard Fikes,et al. Learning and Executing Generalized Robot Plans , 1993, Artif. Intell..
[20] Donald E. Knuth,et al. The art of computer programming: sorting and searching (volume 3) , 1973 .
[21] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974.
[22] John Holland,et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology , 1975.
[23] Donald E. Knuth,et al. The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm , 1981, ICALP.
[24] David B. Benson,et al. Life in the game of Go , 1976 .
[25] I. Witten. The apparent conflict between estimation and control—a survey of the two-armed bandit problem , 1976 .
[26] Hans J. Berliner,et al. Experiences in Evaluation with BKG - A Program that Plays Backgammon , 1977, IJCAI.
[27] Donald W. Loveland,et al. Automated theorem proving: a logical basis , 1978, Fundamental studies in computer science.
[28] A. Elo. The rating of chessplayers, past and present , 1978 .
[29] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[30] George C. Stockman,et al. A Minimax Algorithm Better than Alpha-Beta? , 1979, Artif. Intell..
[31] Larry S. Davis,et al. Pattern Databases , 1979, Data Base Design Techniques II.
[32] Judea Pearl,et al. SCOUT: A Simple Game-Searching Algorithm with Proven Optimal Properties , 1980, AAAI.
[33] Dana S. Nau. Pathology on Game Trees: A Summary of Results , 1980, AAAI.
[34] A. M. Turing,et al. Computing Machinery and Intelligence , 1950, The Philosophy of Artificial Intelligence.
[35] Hans J. Berliner,et al. Backgammon Computer Program Beats World Champion , 1980 .
[36] Aviezri S. Fraenkel,et al. Computing a Perfect Strategy for n*n Chess Requires Time Exponential in N , 1981, ICALP.
[37] T. Nitsche,et al. A Learning Chess Program , 1982.
[38] J J Hopfield,et al. Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.
[39] K. Coplan. A Special-Purpose Machine for an Improved Search Algorithm for Deep Chess Combinations , 1982.
[40] Bruno Buchberger,et al. Computer algebra symbolic and algebraic computation , 1982, SIGS.
[41] Judea Pearl,et al. On the Nature of Pathology in Game Searching , 1983, Artif. Intell..
[42] Bruce W. Ballard,et al. Non-Minimax Search Strategies for Use Against Fallible Opponents , 1983, AAAI.
[43] J. Ross Quinlan,et al. Learning Efficient Classification Procedures and Their Application to Chess End Games , 1983 .
[44] Bruce W. Ballard,et al. The *-Minimax Search Procedure for Trees Containing Chance Nodes , 1983, Artif. Intell..
[45] Donald F. Beal. Recent progress in understanding minimax search , 1983, ACM '83.
[46] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[47] Judea Pearl,et al. Heuristics : intelligent search strategies for computer problem solving , 1984 .
[48] John Philip Fishburn. Analysis of speedup in distributed algorithms , 1984 .
[49] Larry Wos,et al. Automated Reasoning: Introduction and Applications , 1984 .
[50] Lawrence J. Henschen,et al. What Is Automated Theorem Proving? , 1985, J. Autom. Reason..
[51] Gerald J. Sussman,et al. Structure and interpretation of computer programs , 1985, Proceedings of the IEEE.
[52] Ken Thompson,et al. Retrograde Analysis of Certain Endgames , 1986, J. Int. Comput. Games Assoc..
[53] G Schrüfer,et al. Presence and absence of pathology on game trees , 1986 .
[54] Rina Dechter,et al. Learning While Searching in Constraint-Satisfaction-Problems , 1986, AAAI.
[55] R. Geoff Dromey,et al. An algorithm for the selection problem , 1986, Softw. Pract. Exp..
[56] Jonathan Schaeffer,et al. Experiments in Search and Knowledge , 1986, J. Int. Comput. Games Assoc..
[57] Edward Hordern,et al. Sliding Piece Puzzles , 1987 .
[58] Ronald L. Rivest,et al. Game Tree Searching by Min/Max Approximation , 1987, Artif. Intell..
[59] Allen Newell,et al. SOAR: An Architecture for General Intelligence , 1987, Artif. Intell..
[60] Michael N. Katehakis,et al. The Multi-Armed Bandit Problem: Decomposition and Computation , 1987, Math. Oper. Res..
[61] Ingo Althöfer,et al. Root Evaluation Errors: How they Arise and Propagate , 1988, J. Int. Comput. Games Assoc..
[62] Sarit Kraus,et al. Diplomat, an agent in a multi agent environment: An overview , 1988, Seventh Annual International Phoenix Conference on Computers and Communications. 1988 Conference Proceedings.
[63] Donald Michie,et al. Machine Learning in the Next Five Years , 1988, EWSL.
[64] Hermann Kaindl,et al. Minimaxing: Theory and Practice , 1988, AI Mag..
[65] David A. McAllester. Conspiracy Numbers for Min-Max Search , 1988, Artif. Intell..
[66] Jack J. Dongarra,et al. An extended set of FORTRAN basic linear algebra subprograms , 1988, TOMS.
[67] D. Nau,et al. Comparison of the minimax and product back-up rules in a variety of games , 1988 .
[68] Dana S. Nau,et al. A general branch-and-bound formulation for and/or graph and game tree search , 1988 .
[69] Gerald Tesauro,et al. Neurogammon Wins Computer Olympiad , 1989, Neural Computation.
[70] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[71] Donald L. Iglehart,et al. Importance sampling for stochastic simulations , 1989 .
[72] Jonathan Schaeffer,et al. The History Heuristic and Alpha-Beta Search Enhancements in Practice , 1989, IEEE Trans. Pattern Anal. Mach. Intell..
[73] W. S. McCulloch,et al. A logical calculus of the ideas immanent in nervous activity , 1990, The Philosophy of Artificial Intelligence.
[74] Murray Campbell,et al. Singular Extensions: Adding Selectivity to Brute-Force Searching , 1990, Artif. Intell..
[75] Hiroaki Kitano,et al. Designing Neural Networks Using Genetic Algorithms with Graph Generation System , 1990, Complex Syst..
[76] Craig A. Knoblock. Learning Abstraction Hierarchies for Problem Solving , 1990, AAAI.
[77] Donald F. Beal,et al. A Generalised Quiescence Search Algorithm , 1990, Artif. Intell..
[78] Wim Pijls,et al. Another View on the SSS* Algorithm , 1990, SIGAL International Symposium on Algorithms.
[79] Gerald Tesauro,et al. Neurogammon: a neural-network backgammon program , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[80] Albert L. Zobrist,et al. A New Hashing Method with Application for Game Playing , 1990 .
[81] Murray Campbell,et al. Experiments with the Null-Move Heuristic , 1990 .
[82] Ken Chen,et al. Smart game board and go explorer: a study in software and knowledge engineering , 1990, Commun. ACM.
[83] F. Hsu,et al. A Grandmaster Chess Machine , 1990 .
[84] Bruce Abramson,et al. Expected-Outcome: A General Model of Static Evaluation , 1990, IEEE Trans. Pattern Anal. Mach. Intell..
[85] R. A. Brooks,et al. Intelligence without Representation , 1991, Artif. Intell..
[86] Herbert D. Enderton. The Golem Go Program , 1991 .
[87] Dap Hartmann,et al. How Computers Play Chess , 1991, J. Int. Comput. Games Assoc..
[88] Austin Tate,et al. O-Plan: The open Planning Architecture , 1991, Artif. Intell..
[89] Yoshua Bengio,et al. Learning a synaptic learning rule , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[90] Sarit Kraus,et al. Negotiation in a non-cooperative environment , 1991, J. Exp. Theor. Artif. Intell..
[91] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1992, Math. Control. Signals Syst..
[92] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[93] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[94] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[95] Jean-Christophe Weill. The NegaC* Search , 1992, J. Int. Comput. Games Assoc..
[96] Stuart C. Shapiro. The Turing Test and the economist , 1992, SGAR.
[97] Satinder P. Singh,et al. Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.
[98] Lorien Y. Pratt,et al. Discriminability-Based Transfer between Neural Networks , 1992, NIPS.
[99] Jonathan Schaeffer,et al. A World Championship Caliber Checkers Program , 1992, Artif. Intell..
[100] Jaap van den Herik,et al. Heuristic programming in Artificial Intelligence 3: the third computer olympiad , 1992 .
[101] Robert Lake,et al. Solving Large Retrograde Analysis Problems Using a Network of Workstations , 1993 .
[102] B. Rost,et al. Prediction of protein secondary structure at better than 70% accuracy. , 1993, Journal of molecular biology.
[103] Terrence J. Sejnowski,et al. Temporal Difference Learning of Position Evaluation in the Game of Go , 1993, NIPS.
[104] Christian Donninger,et al. Null Move and Deep Search , 1993, J. Int. Comput. Games Assoc..
[105] J. Elman. Learning and development in neural networks: the importance of starting small , 1993, Cognition.
[106] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[107] Thomas Bäck,et al. An Overview of Evolutionary Algorithms for Parameter Optimization , 1993, Evolutionary Computation.
[108] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[109] S. Haykin,et al. Neural Networks: A Comprehensive Foundation , 1994.
[110] L. V. Allis,et al. Searching for solutions in games and artificial intelligence , 1994 .
[111] Sebastian Thrun,et al. Learning to Play the Game of Chess , 1994, NIPS.
[112] H. Jaap van den Herik,et al. Proof-Number Search , 1994, Artif. Intell..
[113] David B. Fogel,et al. An introduction to simulated evolutionary optimization , 1994, IEEE Trans. Neural Networks.
[114] Elwyn R. Berlekamp,et al. Mathematical Go - chilling gets the last point , 1994 .
[115] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[116] Shigeki Iwata,et al. The Othello game on an n*n board is PSPACE-complete , 1994, Theor. Comput. Sci..
[117] Aske Plaat,et al. Solution Trees as a Basis for Game-Tree Search , 1994, J. Int. Comput. Games Assoc..
[118] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[119] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[120] Gerald Tesauro,et al. TD-Gammon: A Self-Teaching Backgammon Program , 1995 .
[121] James L. McClelland,et al. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. , 1995, Psychological review.
[122] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[123] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[124] Sebastian Thrun,et al. Explanation-based neural network learning a lifelong learning approach , 1995 .
[125] Dan Boneh,et al. On genetic algorithms , 1995, COLT '95.
[126] Luca Maria Gambardella,et al. Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem , 1995, ICML.
[127] S. Yakowitz,et al. Machine learning and nonparametric bandit theory , 1995, IEEE Trans. Autom. Control..
[128] Jonathan Schaeffer,et al. CHINOOK: The World Man-Machine Checkers Champion , 1996, AI Mag..
[129] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[130] B. Pell. A Strategic Metagame Player for General Chess-Like Games , 1994, Comput. Intell..
[131] M. Buro. Statistical Feature Combination for the Evaluation of Game Positions , 1995, J. Int. Comput. Games Assoc..
[132] Johannes Fürnkranz,et al. Machine Learning in Computer Chess: The Next Generation , 1996, J. Int. Comput. Games Assoc..
[133] Jonathan Schaeffer,et al. Best-First Fixed-Depth Minimax Algorithms , 1996, J. Int. Comput. Games Assoc..
[134] Jonathan Schaeffer,et al. New advances in Alpha-Beta searching , 1996, CSC '96.
[135] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[136] Jonathan Schaeffer,et al. Exploiting Graph Properties of Game Trees , 1996, AAAI/IAAI, Vol. 1.
[137] Jordan B. Pollack,et al. Why did TD-Gammon Work? , 1996, NIPS.
[138] M. Enzenberger. The Integration of A Priori Knowledge into a Go Playing Neural Network , 1996 .
[139] Jaeyoung Choi,et al. PB-BLAS: a set of parallel block basic linear algebra subprograms , 1996, Concurr. Pract. Exp..
[140] Richard E. Korf,et al. Finding Optimal Solutions to the Twenty-Four Puzzle , 1996, AAAI/IAAI, Vol. 2.
[141] Thomas Bäck,et al. Evolutionary algorithms in theory and practice - evolution strategies, evolutionary programming, genetic algorithms , 1996 .
[142] Ralph Gasser,et al. SOLVING NINE MEN'S MORRIS , 1996, Comput. Intell..
[143] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[144] Aske Plaat,et al. Research, Re: Search and Re-Search , 1996, J. Int. Comput. Games Assoc..
[145] Aske Plaat,et al. Programming Parallel Applications In Cilk , 1997 .
[146] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[147] Monty Newborn,et al. Crafty Goes Deep , 1997, J. Int. Comput. Games Assoc..
[148] Ashwin Ram,et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces , 1997, Adapt. Behav..
[149] Jonathan Schaeffer,et al. Search Versus Knowledge in Game-Playing Programs Revisited , 1997, IJCAI.
[150] Avi Pfeffer,et al. Representations and Solutions for Game-Theoretic Problems , 1997, Artif. Intell..
[151] Michael Buro. Experiments with Multi-ProbCut and a New High-Quality Evaluation Function for Othello , 1997 .
[152] Mark Brockington. KEYANO Unplugged -- The Construction of an Othello Program , 1997 .
[153] Jonathan Schaeffer,et al. Kasparov versus Deep Blue: The Rematch , 1997, J. Int. Comput. Games Assoc..
[154] Luca Maria Gambardella,et al. Ant colony system: a cooperative learning approach to the traveling salesman problem , 1997, IEEE Trans. Evol. Comput..
[155] Michael Buro,et al. The Othello Match of the Year: Takeshi Murakami vs. Logistello , 1997, J. Int. Comput. Games Assoc..
[156] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[157] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[158] Dana S. Nau,et al. Computer Bridge - A Big Win for AI Planning , 1998, AI Mag..
[159] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[160] Shigenobu Kobayashi,et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function , 1998, ICML.
[161] Jonathan Schaeffer,et al. Opponent Modeling in Poker , 1998, AAAI/IAAI.
[162] J. Searle. Mind, Language, And Society: Philosophy In The Real World , 1998 .
[163] Lutz Prechelt,et al. Automatic early stopping using cross validation: quantifying the criteria , 1998, Neural Networks.
[164] Jonathan Baxter. KnightCap: A chess program that learns by combining TD(λ) with game-tree search , 1998.
[165] Alexander J. Smola,et al. Learning with kernels , 1998 .
[166] Michael I. Jordan. Graphical Models , 2003 .
[167] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[168] David B. Fogel,et al. Evolving neural networks to play checkers without relying on expert knowledge , 1999, IEEE Trans. Neural Networks.
[169] Kate Smith-Miles,et al. Neural Networks for Combinatorial Optimization: A Review of More Than a Decade of Research , 1999, INFORMS J. Comput..
[170] Doina Precup,et al. Using Options for Knowledge Transfer in Reinforcement Learning , 1999 .
[171] X. Yao. Evolving Artificial Neural Networks , 1999 .
[172] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[173] Donald F. Beal,et al. Learning Piece-square Values using Temporal Differences , 1999, J. Int. Comput. Games Assoc..
[174] Ernst A. Heinz. Adaptive Null-Move Pruning , 1999, J. Int. Comput. Games Assoc..
[175] Ken Chen,et al. Static Analysis of Life and Death in the Game of Go , 1999, Inf. Sci..
[176] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[177] Vivek S. Borkar,et al. Actor-Critic - Type Learning Algorithms for Markov Decision Processes , 1999, SIAM J. Control. Optim..
[178] Frank Dignum,et al. Deliberative Normative Agents: Principles and Architecture , 1999, ATAL.
[179] Rich Caruana,et al. Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping , 2000, NIPS.
[180] Tsan-sheng Hsu,et al. Construction of Chinese Chess Endgame Databases by Retrograde Analysis , 2000, Computers and Games.
[181] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[182] J. Tenenbaum,et al. A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.
[183] Jonathan Schaeffer,et al. Unifying single-agent and two-player search , 2000, Inf. Sci..
[184] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[185] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[186] Guido Rossum,et al. Python Reference Manual , 2000 .
[187] S T Roweis,et al. Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.
[188] Donald F. Beal,et al. Temporal Difference Learning for Heuristic Search and Game Playing , 2000, Inf. Sci..
[189] Richard E. Korf,et al. Recent Progress in the Design and Analysis of Admissible Heuristic Functions , 2000, AAAI/IAAI.
[190] Ernst A. Heinz,et al. New Self-Play Results in Computer Chess , 2000, Computers and Games.
[191] Erik van der Werf,et al. AI techniques for the game of Go , 2001 .
[192] Teun Koetsier,et al. On the prehistory of programmable machines: musical automata, looms, calculators , 2001 .
[193] E. Vesterinen,et al. Affective Computing , 2009, Encyclopedia of Biometrics.
[194] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[195] T. Patterson. The roots of āyurveda: selections from Sanskrit medical writings , 2001, Medical History.
[196] A. Giotis,et al. Low-Cost Stochastic Optimization for Engineering Applications , 2002.
[197] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[198] Jonathan Schaeffer,et al. The challenge of poker , 2002, Artif. Intell..
[199] Jose Miguel Puerta,et al. Ant colony optimization for learning Bayesian networks , 2002, Int. J. Approx. Reason..
[200] Barbara Webb,et al. Swarm Intelligence: From Natural to Artificial Systems , 2002, Connect. Sci..
[201] Michael Buro,et al. The evolution of strong othello programs , 2002, IWEC.
[202] Eric O. Postma,et al. Local Move Prediction in Go , 2002, Computers and Games.
[203] Gerald Tesauro,et al. Programming backgammon using self-teaching neural nets , 2002, Artif. Intell..
[204] Jürgen Schmidhuber,et al. Learning Nonregular Languages: A Comparison of Simple Recurrent Networks and LSTM , 2002, Neural Computation.
[205] George Tzanetakis,et al. Musical genre classification of audio signals , 2002, IEEE Trans. Speech Audio Process..
[206] Tracy Brown,et al. The Embodied Mind: Cognitive Science and Human Experience , 2002, Cybern. Hum. Knowing.
[207] Feng-Hsiung Hsu,et al. Behind Deep Blue: Building the Computer that Defeated the World Chess Champion , 2002 .
[208] Noriyuki Kobayashi,et al. Cooperation and competition of agents in the auction of computer bridge , 2003 .
[209] Henri E. Bal,et al. Solving awari with parallel retrograde analysis , 2003, Computer.
[210] Kenji Doya,et al. Meta-learning in Reinforcement Learning , 2003, Neural Networks.
[211] Masakazu Matsugu,et al. Subject independent facial expression recognition with robust face detection using a convolutional neural network , 2003, Neural Networks.
[212] Bruno Bouzy,et al. Monte-Carlo Go Developments , 2003, ACG.
[213] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[214] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[215] Monty Newborn,et al. Deep Blue - an artificial intelligence milestone , 2012 .
[216] Jonathan Schaeffer,et al. Approximating Game-Theoretic Optimal Strategies for Full-scale Poker , 2003, IJCAI.
[217] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[218] Peter Van Roy,et al. Concepts, Techniques, and Models of Computer Programming , 2004 .
[219] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[220] David Maxwell Chickering,et al. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.
[221] Cli McMahon,et al. Machines Who Think: A Personal Inquiry into the History and Prospects of Artificial Intelligence , 2004.
[222] R. Duke,et al. Policy games for strategic management , 2004 .
[223] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[224] Geoffrey E. Hinton,et al. Reinforcement Learning with Factored States and Actions , 2004, J. Mach. Learn. Res..
[225] Thomas Stützle,et al. Stochastic Local Search: Foundations & Applications , 2004 .
[226] Keechul Jung,et al. GPU implementation of neural networks , 2004, Pattern Recognit..
[227] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[228] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[229] Frank van Harmelen,et al. A semantic web primer , 2004 .
[230] Jonathan Schaeffer,et al. Rediscovering *-Minimax Search , 2004, Computers and Games.
[231] Jonathan Schaeffer,et al. Game-Tree Search with Adaptation in Stochastic Imperfect-Information Games , 2004, Computers and Games.
[232] Timothy Huang,et al. Experiments with learning opening strategy in the game of go , 2004, Int. J. Artif. Intell. Tools.
[233] Massimiliano Pontil,et al. Regularized multi--task learning , 2004, KDD.
[234] A. Ng. Feature selection, L1 vs. L2 regularization, and rotational invariance , 2004, Twenty-first international conference on Machine learning - ICML '04.
[235] C. Koch. The quest for consciousness : a neurobiological approach , 2004 .
[236] Andrew Tridgell,et al. Learning to Play Chess Using Temporal Differences , 2000, Machine Learning.
[237] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[238] Greg Lindstrom,et al. Programming with Python , 2005, IT Professional.
[239] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[240] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[241] Isabelle Bichindaritz,et al. Medical applications in case-based reasoning , 2005, The Knowledge Engineering Review.
[242] Jürgen Schmidhuber,et al. Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition , 2005, ICANN.
[243] Johan Håstad,et al. On the power of small-depth threshold circuits , 1991, computational complexity.
[244] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[245] Bart Demoen,et al. Programming in Prolog. Using the ISO Standard. by William F. Clocksin, Christopher S. Mellish, Springer-Verlag, 2003, ISBN 3-540-00678-8, xiii+299 pages , 2005, Theory and Practice of Logic Programming.
[246] Ivan Bratko,et al. Bias and pathology in minimax search , 2005, Theor. Comput. Sci..
[247] Ricardo Vilalta,et al. A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.
[248] Dana S. Nau,et al. Experiments on alternatives to minimax , 2005, International Journal of Parallel Programming.
[249] Michael C. Fu,et al. An Adaptive Sampling Algorithm for Solving Markov Decision Processes , 2005, Oper. Res..
[250] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[251] David B. Fogel,et al. Further Evolution of a Self-Learning Chess Program , 2005, CIG.
[252] Tristan Cazenave,et al. Combining Tactical Search and Monte-Carlo in the Game of Go , 2005, CIG.
[253] Michael R. Genesereth,et al. General Game Playing: Overview of the AAAI Competition , 2005, AI Mag..
[254] Philipp Slusallek,et al. Introduction to real-time ray tracing , 2005, SIGGRAPH Courses.
[255] Bram Bakker,et al. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization , 2003 .
[256] Daphne Koller,et al. Ordering-Based Search: A Simple and Effective Algorithm for Learning Bayesian Networks , 2005, UAI.
[257] David B. Fogel,et al. The Blondie25 Chess Program Competes Against Fritz 8.0 and a Human Chess Master , 2006, 2006 IEEE Symposium on Computational Intelligence and Games.
[258] Pieter Abbeel,et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight , 2006, NIPS.
[259] Tuomas Sandholm,et al. A Competitive Texas Hold'em Poker Player via Automated Abstraction and Real-Time Equilibrium Computation , 2006, AAAI.
[260] Catholijn M. Jonker,et al. An agent architecture for multi-attribute negotiation using incomplete preference information , 2007, Autonomous Agents and Multi-Agent Systems.
[261] Rich Caruana,et al. Model compression , 2006, KDD '06.
[262] Jürgen Schmidhuber,et al. Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts , 2005 .
[263] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[264] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[265] Gregory Chaitin,et al. The limits of reason. , 2006, Scientific American.
[266] John Tromp,et al. Combinatorics of Go , 2006, Computers and Games.
[267] Thore Graepel,et al. Bayesian pattern ranking for move prediction in the game of Go , 2006, ICML.
[268] Olivier Teytaud,et al. Modification of UCT with Patterns in Monte-Carlo Go , 2006 .
[269] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[270] Shane Legg,et al. A Collection of Definitions of Intelligence , 2007, AGI.
[271] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[272] James H. Moor,et al. The Dartmouth College Artificial Intelligence Conference: The Next Fifty Years , 2006, AI Mag..
[273] Rémi Coulom,et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search , 2006, Computers and Games.
[274] Massimiliano Pontil,et al. Multi-Task Feature Learning , 2006, NIPS.
[275] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[276] Rémi Coulom. Monte-Carlo Tree Search in Crazy Stone , 2007 .
[277] David Silver,et al. Combining online and offline knowledge in UCT , 2007, ICML '07.
[278] Stephan Schiffel,et al. Fluxplayer: A Successful General Game Player , 2007, AAAI.
[279] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[280] Shi-Chun Tsai,et al. On the fairness and complexity of generalized k-in-a-row games , 2007, Theor. Comput. Sci..
[281] Tom M. Mitchell,et al. The Need for Biases in Learning Generalizations , 2007 .
[282] International Foundation for Autonomous Agents and MultiAgent Systems (IFAAMAS) , 2007.
[283] Eric A. Hansen,et al. Anytime Heuristic Search , 2011, J. Artif. Intell. Res..
[284] Rajat Raina,et al. Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.
[285] Joost Broekens,et al. Emotion and Reinforcement: Affective Facial Expressions Facilitate Robot Learning , 2007, Artifical Intelligence for Human Computing.
[286] Richard E. Neapolitan,et al. Learning Bayesian networks , 2007, KDD '07.
[287] Michael H. Bowling,et al. Regret Minimization in Games with Incomplete Information , 2007, NIPS.
[288] Richard S. Sutton,et al. Reinforcement Learning of Local Shape in the Game of Go , 2007, IJCAI.
[289] Mark Harman,et al. The Current State and Future of Search Based Software Engineering , 2007, Future of Software Engineering (FOSE '07).
[290] Mauro Birattari,et al. Swarm Intelligence , 2012, Lecture Notes in Computer Science.
[291] Yoshua. Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[292] Jürgen Schmidhuber,et al. An Application of Recurrent Neural Networks to Discriminative Keyword Spotting , 2007, ICANN.
[293] Johannes Fürnkranz,et al. Learning of Piece Values for Chess Variants , 2008 .
[294] Pieter Spronck,et al. Monte-Carlo Tree Search: A New Framework for Game AI , 2008, AIIDE.
[295] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[296] Nathan S. Netanyahu,et al. Genetic algorithms for mentor-assisted evaluation function optimization , 2008, GECCO '08.
[297] H. Jaap van den Herik,et al. Parallel Monte-Carlo Tree Search , 2008, Computers and Games.
[298] Leslie G. Valiant,et al. Knowledge Infusion: In Pursuit of Robustness in Artificial Intelligence , 2008, FSTTCS.
[299] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[300] Tibor Bosse,et al. Formalisation of Damasio’s theory of emotion, feeling and core consciousness , 2008, Consciousness and Cognition.
[301] Sunita Sarawagi. Learning with Graphical Models , 2008 .
[302] Ilya Sutskever,et al. Mimicking Go Experts with Convolutional Neural Networks , 2008, ICANN.
[303] H. Jaap van den Herik,et al. Single-Player Monte-Carlo Tree Search , 2008, Computers and Games.
[304] David Silver,et al. Achieving Master Level Play in 9×9 Computer Go , 2008, AAAI.
[305] Jonathan Schaeffer,et al. One Jump Ahead: Computer Perfection at Checkers , 2008 .
[306] H. Jaap van den Herik,et al. Progressive Strategies for Monte-Carlo Tree Search , 2008 .
[307] J. Huizinga. Homo ludens: proeve eener bepaling van het spel-element der cultuur (Homo Ludens: A Study of the Play-Element in Culture) , 2008 .
[308] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[309] Yngvi Björnsson,et al. CadiaPlayer: A Simulation-Based General Game Player , 2009, IEEE Transactions on Computational Intelligence and AI in Games.
[310] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[311] Joel Veness,et al. Bootstrapping from Game Tree Search , 2009, NIPS.
[312] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.
[313] Nils J. Nilsson,et al. The Quest for Artificial Intelligence , 2009 .
[314] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[315] J. Broekens,et al. Assistive social robots in elderly care: a review , 2009 .
[316] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[317] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[318] Marco Scutari,et al. Learning Bayesian Networks with the bnlearn R Package , 2009, 0908.3817.
[319] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.
[320] Luís Seabra Lopes,et al. DarkBlade: A Program That Plays Diplomacy , 2009, EPIA.
[321] Burr Settles,et al. Active Learning Literature Survey , 2009 .
[322] David P. Helmbold,et al. All-Moves-As-First Heuristics in Monte-Carlo Go , 2009, IC-AI.
[323] Frank L. Lewis,et al. Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem , 2009, 2009 International Joint Conference on Neural Networks.
[324] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[325] Martin Müller. Fuego at the Computer Olympiad in Pamplona 2009: A Tournament Report , 2009 .
[326] Pieter Spronck,et al. Monte-Carlo Tree Search in Settlers of Catan , 2009, ACG.
[327] Mark H. M. Winands,et al. Quiescence Search for Stratego , 2009 .
[328] Ricardo Vilalta,et al. Metalearning - Applications to Data Mining , 2008, Cognitive Technologies.
[329] Dana S. Nau,et al. Error Minimizing Minimax : Avoiding Search Pathology in Game Trees , 2009 .
[330] Kai A. Krueger,et al. Flexible shaping: How learning in small steps helps , 2009, Cognition.
[331] Daniel Michulke,et al. Neural Networks for State Evaluation in General Game Playing , 2009, ECML/PKDD.
[332] Shalabh Bhatnagar,et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation , 2009, NIPS.
[333] Martin Müller,et al. A Lock-Free Multithreaded Monte-Carlo Tree Search Algorithm , 2009, ACG.
[334] Michael Thielscher. Answer Set Programming for Single-Player Games in General Game Playing , 2009, ICLP.
[335] David Silver,et al. Reinforcement Learning and Simulation Based Search in the Game of Go , 2009 .
[336] Yoshua Bengio,et al. Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.
[337] Julian Togelius,et al. Multiobjective exploration of the StarCraft map space , 2010, Proceedings of the 2010 IEEE Conference on Computational Intelligence and Games.
[338] Michael Thielscher,et al. A General Game Description Language for Incomplete Information Games , 2010, AAAI.
[339] Yngvi Björnsson,et al. Learning Simulation Control in General Game-Playing Agents , 2010, AAAI.
[340] Sarit Kraus,et al. Can automated agents proficiently negotiate with humans? , 2010, CACM.
[341] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[342] Marco Wiering. Self-Play and Using an Expert to Learn to Play Backgammon with Temporal Difference Learning , 2010, J. Intell. Learn. Syst. Appl..
[343] J. O’Neill,et al. Play it again: reactivation of waking experience and memory , 2010, Trends in Neurosciences.
[344] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..
[345] Leslie Pack Kaelbling,et al. Hierarchical Planning in the Now , 2010, Bridging the Gap Between Task and Motion Planning.
[346] Nicola Beume,et al. Towards Intelligent Team Composition and Maneuvering in Real-Time Strategy Games , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[347] Julian Togelius,et al. Search-Based Procedural Content Generation , 2010, EvoApplications.
[348] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[349] Hendrik Baier,et al. The Power of Forgetting: Improving the Last-Good-Reply Policy in Monte Carlo Go , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[350] Jie Cheng,et al. CUDA by Example: An Introduction to General-Purpose GPU Programming , 2010, Scalable Comput. Pract. Exp..
[351] Martin Müller,et al. Fuego—An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[352] Tuomas Sandholm,et al. The State of Solving Large Incomplete-Information Games, and Application to Poker , 2010, AI Mag..
[353] Li Fei-Fei,et al. ImageNet: Constructing a large-scale image database , 2010 .
[354] Julien Kloetzer. Monte-Carlo Opening Books for Amazons , 2010, Computers and Games.
[355] Ryan B. Hayward,et al. Monte Carlo Tree Search in Hex , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[356] Peter Auer,et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem , 2010, Period. Math. Hung..
[357] Thomas Bartz-Beielstein,et al. Experimental Methods for the Analysis of Optimization Algorithms , 2010 .
[358] Bart Selman,et al. On Adversarial Search Spaces and Sampling-Based Planning , 2010, ICAPS.
[359] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[360] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.
[361] Olivier Teytaud,et al. Special Issue on Monte Carlo Techniques and Computer Go , 2010, IEEE Trans. Comput. Intell. AI Games.
[362] Fons J. Verbeek,et al. Pattern Recognition for High Throughput Zebrafish Imaging Using Genetic Algorithm Optimization , 2010, PRIB.
[363] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[364] Richard B. Segal,et al. On the Scalability of Parallel UCT , 2010, Computers and Games.
[365] Joel Veness,et al. Monte-Carlo Planning in Large POMDPs , 2010, NIPS.
[366] Shang-Rong Tsai,et al. Current Frontiers in Computer Go , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[367] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[368] Petr Baudis,et al. Balancing MCTS by Dynamically Adjusting the Komi Value , 2011, J. Int. Comput. Games Assoc..
[369] Petr Baudis,et al. PACHI: State of the Art Open Source Go Program , 2011, ACG.
[370] Damien Pellier,et al. MCTS Experiments on the Voronoi Game , 2011, ACG.
[371] Ian D. Watson,et al. Computer poker: A review , 2011, Artif. Intell..
[372] Alan Fern,et al. Ensemble Monte-Carlo Planning: An Empirical Study , 2011, ICAPS.
[373] Mohamed Chtourou,et al. On the training of recurrent neural networks , 2011, Eighth International Multi-Conference on Systems, Signals & Devices.
[374] Arno J. Knobbe,et al. Non-redundant Subgroup Discovery in Large and Complex Data , 2011, ECML/PKDD.
[375] Richard J. Lorentz. Experiments with Monte-Carlo Tree Search in the Game of Havannah , 2011, J. Int. Comput. Games Assoc..
[376] Stefan Schaal,et al. Hierarchical reinforcement learning with movement primitives , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.
[377] B. V. Bowden (ed.). Faster Than Thought (including the chapter Digital Computers Applied to Games) , 2011 .
[378] Kevin Leyton-Brown,et al. Sequential Model-Based Optimization for General Algorithm Configuration , 2011, LION.
[379] Christopher D. Rosin,et al. Multi-armed bandits with episode context , 2011, Annals of Mathematics and Artificial Intelligence.
[380] Perry R. Cook,et al. Real-time human interaction with supervised learning algorithms for music composition and performance , 2011 .
[381] Tuomas Sandholm,et al. Game theory-based opponent modeling in large imperfect-information games , 2011, AAMAS.
[382] H. V. van Vlijmen,et al. Which Compound to Select in Lead Optimization? Prospectively Validated Proteochemometric Models Guide Preclinical Development , 2011, PloS one.
[383] Hrafn Eiríksson,et al. Investigation of Multi-Cut Pruning in Game-Tree Search , 2011 .
[384] Kamil Rocki,et al. Large-Scale Parallel Monte Carlo Tree Search on GPU , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[385] Sabine Kastner,et al. Human consciousness and its relationship to social neuroscience: A novel hypothesis , 2011, Cognitive neuroscience.
[386] David Silver,et al. Monte-Carlo tree search and rapid action value estimation in computer Go , 2011, Artif. Intell..
[387] Geoffrey E. Hinton,et al. Generating Text with Recurrent Neural Networks , 2011, ICML.
[388] Richard J. Lorentz. An MCTS Program to Play EinStein Würfelt Nicht! , 2011, ACG.
[389] Maarten Sierhuis,et al. Beyond Cooperative Robotics: The Central Role of Interdependence in Coactive Design , 2011, IEEE Intelligent Systems.
[390] G. Kalyanaram,et al. Nudge: Improving Decisions about Health, Wealth, and Happiness , 2011 .
[391] Michael Thielscher. The General Game Playing Description Language Is Universal , 2011, IJCAI.
[392] Huajun Chen,et al. The Semantic Web , 2011, Lecture Notes in Computer Science.
[393] Lutz Prechelt,et al. Early Stopping - But When? , 2012, Neural Networks: Tricks of the Trade.
[394] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[395] Yoshua Bengio,et al. Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..
[396] Michael Thielscher,et al. HyperPlay: A Solution to General Game Playing with Imperfect Information , 2012, AAAI.
[397] Jasper Snoek,et al. Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.
[398] Jürgen Schmidhuber,et al. Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[399] Razvan Pascanu,et al. Theano: Deep Learning on GPUs with Python , 2012 .
[400] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[401] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.
[402] Ronald Parr,et al. Greedy Algorithms for Sparse Reinforcement Learning , 2012, ICML.
[403] Julian Togelius,et al. The Mario AI Benchmark and Competitions , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[404] Tomáš Mikolov. Statistical Language Models Based on Neural Networks , 2012, PhD thesis, Brno University of Technology.
[405] Richard S. Sutton,et al. Temporal-difference search in computer Go , 2012, Machine Learning.
[406] Michèle Sebag,et al. The grand challenge of computer Go , 2012, Commun. ACM.
[407] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[408] Yee Whye Teh,et al. Actor-Critic Reinforcement Learning with Energy-Based Policies , 2012, EWRL.
[409] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[410] Dimitri P. Bertsekas,et al. Rollout Algorithms for Discrete Optimization: A Survey , 2012 .
[411] Simon Colton,et al. The Painting Fool: Stories from Building an Automated Painter , 2012 .
[412] Alex Graves,et al. Supervised Sequence Labelling , 2012 .
[413] Song Yu,et al. Sparse Matrix-Vector Multiplication on NVIDIA GPU , 2012 .
[414] Michael Johanson,et al. Measuring the Size of Large No-Limit Poker Games , 2013, ArXiv.
[415] Christopher Archibald,et al. Monte Carlo *-Minimax Search , 2013, IJCAI.
[416] Marco Wiering,et al. Reinforcement learning in the game of Othello: Learning against a fixed opponent and learning from self-play , 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[417] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[418] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[419] Sergey Levine,et al. Guided Policy Search , 2013, ICML.
[420] Carlos Cotta,et al. A review of computational intelligence in RTS games , 2013, 2013 IEEE Symposium on Foundations of Computational Intelligence (FOCI).
[421] Michel Gendreau,et al. Hyper-heuristics: a survey of the state of the art , 2013, J. Oper. Res. Soc..
[422] Martin Zinkevich,et al. The Annual Computer Poker Competition , 2013, AI Mag..
[423] Sarit Kraus,et al. Evaluating practical negotiating agents: Results and analysis of the 2011 international competition , 2013, Artif. Intell..
[424] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[425] Alex Alves Freitas,et al. Contrasting meta-learning and hyper-heuristic research: the role of evolutionary algorithms , 2013, Genetic Programming and Evolvable Machines.
[426] Édouard Bonnet,et al. On the Complexity of Trick-Taking Card Games , 2013, IJCAI.
[427] Qiang Yang,et al. Lifelong Machine Learning Systems: Beyond Learning Algorithms , 2013, AAAI Spring Symposium: Lifelong Machine Learning.
[428] Santiago Ontañón,et al. A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft , 2013, IEEE Transactions on Computational Intelligence and AI in Games.
[429] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[430] Marc G. Bellemare,et al. Bayesian Learning of Recursively Factored Environments , 2013, ICML.
[431] H. Jaap van den Herik,et al. Improving multivariate Horner schemes with Monte Carlo tree search , 2012, Comput. Phys. Commun..
[432] Kevin Leyton-Brown,et al. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms , 2012, KDD.
[433] I. Good. A Five-Year Plan for Automatic Chess , 2013 .
[434] Shih-Chieh Huang,et al. MoHex 2.0: A Pattern-Based MCTS Hex Player , 2013, Computers and Games.
[435] H. Jaap van den Herik,et al. Investigations with Monte Carlo Tree Search for Finding Better Multivariate Horner Schemes , 2013, ICAART.
[436] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[437] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[438] Zhiwei Qin,et al. Sparse Reinforcement Learning via Convex Optimization , 2014, ICML.
[439] H. Jaap van den Herik,et al. HEPGAME and the Simplification of Expressions , 2014, ArXiv.
[440] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[441] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[442] P. Baldi,et al. Searching for exotic particles in high-energy physics with deep learning , 2014, Nature Communications.
[443] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[444] Jonathan Schaeffer,et al. A New Paradigm for Minimax Search , 2014, ArXiv.
[445] H. Jaap van den Herik,et al. Genetic Algorithms for Evolving Computer Chess Programs , 2014, IEEE Transactions on Evolutionary Computation.
[446] Jack J. Dongarra,et al. Scaling up matrix computations on shared-memory manycore systems with 1000 CPU cores , 2014, ICS '14.
[447] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[448] Vedran Dunjko,et al. Quantum speedup for active learning agents , 2014, 1401.4997.
[449] Ian H. Witten,et al. Data Mining: Practical Machine Learning Tools and Techniques , 2014 .
[450] Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
[451] Sarit Kraus,et al. Genius: An Integrated Environment for Supporting the Design of Generic Automated Negotiators , 2012, Comput. Intell..
[452] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[453] Peter A. Flach,et al. Subgroup Discovery in Smart Electricity Meter Data , 2014, IEEE Transactions on Industrial Informatics.
[454] Steven te Brinke,et al. Monte Carlo Tree Search , 2014 .
[455] Diego Klabjan,et al. Skill-based differences in spatio-temporal team behaviour in defence of the Ancients 2 (DotA 2) , 2014, 2014 IEEE Games Media Entertainment.
[456] D. Hambrick,et al. Deliberate Practice and Performance in Music, Games, Sports, Education, and Professions: A Meta-Analysis , 2014, Psychological Science.
[457] H. Jaap van den Herik,et al. Combining Simulated Annealing and Monte Carlo Tree Search for Expression Simplification , 2013, ICAART.
[458] Rasoul Karimi,et al. Active Learning for Recommender Systems , 2015, KI - Künstliche Intelligenz.
[459] Hesham El-Deeb,et al. A Comparative Study of Game Tree Searching Methods , 2014 .
[460] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[461] Neil Burch,et al. Heads-up limit hold’em poker is solved , 2015, Science.
[462] David Silver,et al. Move Evaluation in Go Using Deep Convolutional Neural Networks , 2014, ICLR.
[463] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[464] H. Jaap van den Herik,et al. Scaling Monte Carlo Tree Search on Intel Xeon Phi , 2015, 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS).
[465] Ralf Funke. From Gobble to Zen: The Quest for Truly Intelligent Software and the Monte Carlo Revolution in Go , 2015 .
[466] D. Jonge. Negotiations over large agreement spaces , 2015 .
[467] Richard Evans,et al. Deep Reinforcement Learning in Large Discrete Action Spaces , 2015, 1512.07679.
[468] Philip H. S. Torr,et al. An embarrassingly simple approach to zero-shot learning , 2015, ICML.
[469] Matthew Lai,et al. Giraffe: Using Deep Reinforcement Learning to Play Chess , 2015, ArXiv.
[470] Hao Wang,et al. Optimally Weighted Cluster Kriging for Big Data Regression , 2015, IDA.
[471] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[472] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[473] Xinlei Chen,et al. Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.
[474] Aaron Klein,et al. Efficient and Robust Automated Machine Learning , 2015, NIPS.
[475] Joshua B. Tenenbaum,et al. Human-level concept learning through probabilistic program induction , 2015, Science.
[476] Navdeep Jaitly,et al. Pointer Networks , 2015, NIPS.
[477] Garrison W. Cottrell,et al. Basic Level Categorization Facilitates Visual Object Recognition , 2015, ArXiv.
[478] Marco Platzner,et al. Adaptive Playouts in Monte-Carlo Tree Search with Policy-Gradient Reinforcement Learning , 2015, ACG.
[479] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[480] H. Jaap van den Herik,et al. Past Our Prime: A Study of Age and Play Style Development in Battlefield 3 , 2015, IEEE Transactions on Computational Intelligence and AI in Games.
[481] Amos J. Storkey,et al. Training Deep Convolutional Neural Networks to Play Go , 2015, ICML.
[482] Martin A. Riedmiller,et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images , 2015, NIPS.
[483] Santiago Ontañón,et al. A Benchmark for StarCraft Intelligent Agents , 2015 .
[484] Luc De Raedt,et al. Neural-Symbolic Learning and Reasoning: Contributions and Challenges , 2015, AAAI Spring Symposia.
[485] Bojun Huang,et al. Pruning Game Tree by Rollouts , 2015, AAAI.
[486] Simon M. Lucas,et al. Open Loop Search for General Video Game Playing , 2015, GECCO.
[487] Michael Thielscher,et al. Lifting Model Sampling for General Game Playing to Incomplete-Information Models , 2015, AAAI.
[488] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[489] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[490] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[491] H. Jaap van den Herik,et al. Transfer Learning of Air Combat Behavior , 2015, 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA).
[492] Trevor Darrell,et al. Sequence to Sequence -- Video to Text , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[493] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[494] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[495] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[496] John D. Kelleher,et al. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies , 2015 .
[497] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[498] Yuandong Tian,et al. Better Computer Go Player with Neural Network and Long-term Prediction , 2016, ICLR.
[499] Daan Wierstra,et al. One-Shot Generalization in Deep Generative Models , 2016, ICML.
[500] Nando de Freitas,et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.
[501] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[502] Chiara F. Sironi,et al. Comparison of rapid action value estimation variants for general game playing , 2016, 2016 IEEE Conference on Computational Intelligence and Games (CIG).
[503] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[504] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[505] Christin Wirth,et al. Blondie24: Playing at the Edge of AI , 2016 .
[506] H. Jaap van den Herik,et al. Ensemble UCT Needs High Exploitation , 2015, ICAART.
[507] Daan Wierstra,et al. Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.
[508] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[509] Nando de Freitas,et al. Bayesian Optimization in a Billion Dimensions via Random Embeddings , 2013, J. Artif. Intell. Res..
[510] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[511] Leslie Pérez Cáceres,et al. The irace package: Iterated racing for automatic algorithm configuration , 2016 .
[512] Alex Graves,et al. Strategic Attentive Writer for Learning Macro-Actions , 2016, NIPS.
[513] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[514] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[515] Michael S. Lew,et al. Deep learning for visual understanding: A review , 2016, Neurocomputing.
[516] Thomas G. Dietterich,et al. Incorporating Expert Feedback into Active Anomaly Discovery , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).
[517] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[518] Nathan S. Netanyahu,et al. DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess , 2016, ICANN.
[519] Pieter Abbeel,et al. Value Iteration Networks , 2016, NIPS.
[520] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[521] Andreas Müller,et al. Introduction to Machine Learning with Python: A Guide for Data Scientists , 2016 .
[522] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[523] John Tromp,et al. A Googolplex of Go Games , 2016, Computers and Games.
[524] Aske Plaat,et al. On the Impact of Data Set Size in Transfer Learning Using Deep Neural Networks , 2016, IDA.
[525] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[526] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[527] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[528] Julian Togelius,et al. The 2014 General Video Game Playing Competition , 2016, IEEE Transactions on Computational Intelligence and AI in Games.
[529] Koen V. Hindriks,et al. Automated Negotiating Agents Competition (ANAC) , 2017, AAAI.
[530] Chiara F. Sironi,et al. On-Line Parameter Tuning for Monte-Carlo Tree Search in General Game Playing , 2017, CGW@IJCAI.
[531] Aurélien Géron,et al. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems , 2017 .
[532] Hugo Larochelle,et al. Optimization as a Model for Few-Shot Learning , 2016, ICLR.
[533] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[534] Nikhil Ketkar,et al. Deep Learning with Python , 2017 .
[535] Alex Graves,et al. Automated Curriculum Learning for Neural Networks , 2017, ICML.
[536] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[537] Tom Schaul,et al. The Predictron: End-To-End Learning and Planning , 2016, ICML.
[538] Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
[539] Vivienne Sze,et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.
[540] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[541] C A Nelson,et al. Learning to Learn , 2017, Encyclopedia of Machine Learning and Data Mining.
[542] Yuandong Tian,et al. ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games , 2017, NIPS.
[543] Koray Kavukcuoglu,et al. Combining policy gradient and Q-learning , 2016, ICLR.
[544] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[545] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[546] Ramesh Raskar,et al. Designing Neural Network Architectures using Reinforcement Learning , 2016, ICLR.
[547] Catholijn M. Jonker,et al. Efficient exploration with Double Uncertain Value Networks , 2017, ArXiv.
[548] M. Kubát. An Introduction to Machine Learning , 2017, Springer International Publishing.
[549] David Barber,et al. Thinking Fast and Slow with Deep Learning and Tree Search , 2017, NIPS.
[550] Dave de Jonge,et al. D-Brane: a Diplomacy playing agent for automated negotiations research , 2017 .
[551] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[552] Malcolm I. Heywood,et al. Multi-task learning in Atari video games with emergent tangled program graphs , 2017, GECCO.
[553] Lina J. Karam,et al. A Study and Comparison of Human and Deep Learning Recognition Performance under Visual Distortions , 2017, 2017 26th International Conference on Computer Communication and Networks (ICCCN).
[554] Bernt Schiele,et al. Zero-Shot Learning — The Good, the Bad and the Ugly , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[555] Tao Zhang,et al. A Survey of Model Compression and Acceleration for Deep Neural Networks , 2017, ArXiv.
[556] S. Baum. A Survey of Artificial General Intelligence Projects for Ethics, Risk, and Policy , 2017 .
[557] Dale Schuurmans,et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning , 2017, NIPS.
[558] Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
[559] Jürgen Schmidhuber,et al. LSTM: A Search Space Odyssey , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[560] Sergey Ioffe,et al. Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models , 2017, NIPS.
[561] H. Jaap van den Herik,et al. Structured Parallel Programming for Monte Carlo Tree Search , 2017, ArXiv.
[562] Xiaoming Liu,et al. Disentangled Representation Learning GAN for Pose-Invariant Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[563] Geoffrey E. Hinton,et al. Distilling a Neural Network Into a Soft Decision Tree , 2017, CEx@AI*IA.
[564] Razvan Pascanu,et al. Learning model-based planning from scratch , 2017, ArXiv.
[565] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[566] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[567] Lars Kotthoff,et al. Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA , 2017, J. Mach. Learn. Res..
[568] Shimon Whiteson,et al. TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning , 2017, ICLR 2018.
[569] Peter Henderson,et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control , 2017, ArXiv.
[570] H. Jaap van den Herik,et al. An Analysis of Virtual Loss in Parallel MCTS , 2017, ICAART.
[571] Simon M. Lucas,et al. General Video Game AI: Learning from screen capture , 2017, 2017 IEEE Congress on Evolutionary Computation (CEC).
[572] Yunguan Fu,et al. Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization , 2018, ArXiv.
[573] Chih-Cheng Lai,et al. Comparison of machine learning models for the prediction of mortality of patients with unplanned extubation in intensive care units , 2018, Scientific Reports.
[574] Aske Plaat,et al. Priming Digitisation: Learning the Textual Structure in Field Books , 2018 .
[575] Frank Hutter,et al. Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari , 2018, IJCAI.
[576] Catholijn M. Jonker,et al. The Potential of the Return Distribution for Exploration in RL , 2018, ArXiv.
[577] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[578] Sergey Levine,et al. Probabilistic Model-Agnostic Meta-Learning , 2018, NeurIPS.
[579] Wu Chen,et al. A Search Optimization Method for Rule Learning in Board Games , 2018, PRICAI.
[580] Tuomas Sandholm,et al. Depth-Limited Solving for Imperfect-Information Games , 2018, NeurIPS.
[581] Hui Wang,et al. Assessing the Potential of Classical Q-learning in General Game Playing , 2018, BNCAI.
[582] Wenlong Fu,et al. Model-based reinforcement learning: A survey , 2018 .
[583] Rob Fergus,et al. Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning , 2018, ArXiv.
[584] H. Jaap van den Herik,et al. Pipeline Pattern for Parallel MCTS , 2018, ICAART.
[585] Sergey Levine,et al. Meta-Reinforcement Learning of Structured Exploration Strategies , 2018, NeurIPS.
[586] Matteo Hessel,et al. Deep Reinforcement Learning and the Deadly Triad , 2018, ArXiv.
[587] Bas van Stein,et al. Automatic Configuration of Deep Neural Networks with EGO , 2018, ArXiv.
[588] Jaakko Lehtinen,et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.
[589] M. Hutson. Artificial intelligence faces reproducibility crisis. , 2018, Science.
[590] Michael I. Jordan,et al. RLlib: Abstractions for Distributed Reinforcement Learning , 2017, ICML.
[591] Rémi Munos,et al. Learning to Search with MCTSnets , 2018, ICML.
[592] Malcolm I. Heywood,et al. Emergent Tangled Program Graphs in Multi-Task Learning , 2018, IJCAI.
[593] Henry Charlesworth,et al. Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information , 2018, ArXiv.
[594] Joel Z. Leibo,et al. Human-level performance in first-person multiplayer games with population-based deep reinforcement learning , 2018, ArXiv.
[595] Wojciech Samek,et al. Methods for interpreting and understanding deep neural networks , 2017, Digit. Signal Process..
[596] Catholijn M. Jonker,et al. Monte Carlo Tree Search for Asymmetric Trees , 2018, ArXiv.
[597] Koen V. Hindriks,et al. StarCraft as a Testbed for Engineering Complex Distributed Systems Using Cognitive Agent Technology , 2018, AAMAS.
[598] Geraint Rees,et al. Clinically applicable deep learning for diagnosis and referral in retinal disease , 2018, Nature Medicine.
[599] Sergey Levine,et al. Unsupervised Meta-Learning for Reinforcement Learning , 2018, ArXiv.
[600] Julian Togelius,et al. Artificial Intelligence and Games , 2018, Springer International Publishing.
[601] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.
[602] Elmar Eisemann,et al. DeepEyes: Progressive Visual Analytics for Designing Deep Neural Networks , 2018, IEEE Transactions on Visualization and Computer Graphics.
[603] Takayuki Ito,et al. The Challenge of Negotiation in the Game of Diplomacy , 2018, AT.
[604] Ben Ruijl,et al. Games and loop integrals , 2018, Journal of Physics: Conference Series.
[605] H. Jaap van den Herik,et al. A Lock-free Algorithm for Parallel MCTS , 2018, ICAART.
[606] Noam Brown,et al. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals , 2018, Science.
[607] Marc G. Bellemare,et al. The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning , 2017, ICLR.
[608] Mike Preuss,et al. Planning chemical syntheses with deep neural networks and symbolic AI , 2017, Nature.
[609] Kiminori Matsuzaki. Empirical Analysis of PUCT Algorithm with Evaluation Functions of Different Quality , 2018, 2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI).
[610] Andrea Montanari,et al. A mean field view of the landscape of two-layer neural networks , 2018, Proceedings of the National Academy of Sciences.
[611] Youhei Akimoto,et al. Probabilistic Model-Based Dynamic Architecture Search , 2018 .
[612] Mélanie Frappier,et al. The Book of Why: The New Science of Cause and Effect , 2018, Science.
[613] Catholijn M. Jonker,et al. A0C: Alpha Zero in Continuous Action Space , 2018, ArXiv.
[614] D. Weinshall,et al. Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks , 2018, ICML.
[615] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[616] Quoc V. Le,et al. Efficient Neural Architecture Search via Parameter Sharing , 2018, ICML.
[617] S. Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[618] Frank Hutter,et al. Neural Architecture Search: A Survey , 2018, J. Mach. Learn. Res..
[619] Joelle Pineau,et al. No Press Diplomacy: Modeling Multi-Agent Gameplay , 2019, NeurIPS.
[620] Ruiyang Xu,et al. Learning Self-Game-Play Agents for Combinatorial Optimization Problems , 2019, AAMAS.
[621] Joel Z. Leibo,et al. Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research , 2019, ArXiv.
[622] Noam Brown,et al. Superhuman AI for multiplayer poker , 2019, Science.
[623] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[624] Ruth Nussinov,et al. Computational Structural Biology: Successes, Future Directions, and Challenges , 2019, Molecules.
[625] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[626] Sergey Levine,et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables , 2019, ICML.
[627] Joelle Pineau,et al. Online Adaptative Curriculum Learning for GANs , 2018, AAAI.
[628] Pieter Abbeel,et al. Benchmarking Model-Based Reinforcement Learning , 2019, ArXiv.
[629] Tom Eccles,et al. An investigation of model-free planning , 2019, ICML.
[630] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[631] Tong Lu,et al. On Reinforcement Learning for Full-length Game of StarCraft , 2018, AAAI.
[632] Kouichi Sakurai,et al. One Pixel Attack for Fooling Deep Neural Networks , 2017, IEEE Transactions on Evolutionary Computation.
[633] Mike Preuss,et al. Alternative Loss Functions in AlphaZero-like Self-play , 2019, 2019 IEEE Symposium Series on Computational Intelligence (SSCI).
[634] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[635] Amos J. Storkey,et al. How to train your MAML , 2018, ICLR.
[636] Heike Trautmann,et al. Automated Algorithm Selection: Survey and Perspectives , 2018, Evolutionary Computation.
[637] Kenneth O. Stanley,et al. Go-Explore: a New Approach for Hard-Exploration Problems , 2019, ArXiv.
[638] Tim Salimans,et al. Policy Gradient Search: Online Planning and Expert Iteration without Search Trees , 2019, ArXiv.
[639] Elliot Meyerson,et al. Evolving Deep Neural Networks , 2017, Artificial Intelligence in the Age of Neural Networks and Brain Computing.
[640] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[641] Ryan B. Hayward,et al. Hex: The Full Story , 2019 .
[642] Xin Yang,et al. Exposing Deep Fakes Using Inconsistent Head Poses , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[643] David J. Wu,et al. Accelerating Self-Play Learning in Go , 2019, ArXiv.
[644] Yuxi Li,et al. Deep Reinforcement Learning , 2018, Reinforcement Learning for Cyber-Physical Systems.
[645] Dennis J. N. J. Soemers,et al. Strategic Features for General Games , 2019, KEG@AAAI.
[646] Ruben Villegas,et al. Learning Latent Dynamics for Planning from Pixels , 2018, ICML.
[647] Sergey Levine,et al. Model-Based Reinforcement Learning for Atari , 2019, ICLR.
[648] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[649] Tor Lattimore,et al. Behaviour Suite for Reinforcement Learning , 2019, ICLR.
[650] Junhyuk Oh,et al. Discovering Reinforcement Learning Algorithms , 2020, NeurIPS.
[651] H. Sangani,et al. Do Androids Dream of Electric Sheep? , 2020, Faculty Brat.
[652] Sebastian Risi,et al. From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI , 2020, KI - Künstliche Intelligenz.
[653] Zeb Kurth-Nelson,et al. A distributional code for value in dopamine-based reinforcement learning , 2020, Nature.
[654] Yoram Bachrach,et al. Learning to Play No-Press Diplomacy with Best Response Policy Iteration , 2020, NeurIPS.
[655] Mike Preuss,et al. Model-Based Deep Reinforcement Learning for High-Dimensional Problems, a Survey , 2020, ArXiv.
[656] Matthew E. Taylor,et al. Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey , 2020, J. Mach. Learn. Res..
[657] Hui Wang,et al. Analysis of Hyper-Parameters for Small Games: Iterations or Epochs in Self-Play? , 2020, ArXiv.
[658] John Schulman,et al. Teacher–Student Curriculum Learning , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[659] T. L. Lai,et al. Asymptotically Efficient Adaptive Allocation Rules , 1985, Advances in Applied Mathematics.