Evolving Static Representations for Task Transfer

An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Previous approaches to transfer in Keepaway have focused on transforming the original representation to fit the new task. In contrast, this paper explores the idea that transfer is most effective if the representation is designed to remain the same across different tasks. To demonstrate this point, a bird's eye view (BEV) representation is introduced that can represent different tasks on the same two-dimensional map. For example, both the 3 vs. 2 and 4 vs. 3 Keepaway tasks can be represented on the same BEV. However, a raw two-dimensional map is high-dimensional and unstructured. This paper shows that this problem is addressed naturally by indirect encoding, an idea from evolutionary computation that compresses the representation by exploiting its geometry. The result is that a Keepaway policy learned on the BEV transfers to the new task without further learning or manipulation. The BEV also facilitates transferring knowledge learned in a different domain, Knight Joust, into Keepaway. Finally, the indirect encoding of the BEV means that its geometry can be changed without altering the solution. Thus static representations facilitate several kinds of transfer.
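To make the geometry concrete, here is a minimal sketch, in Python, of how a BEV input might be built. This is not code from the paper: the grid resolution, field size, and cell markers are illustrative assumptions. The point it demonstrates is that agent positions from any Keepaway variant rasterize onto one fixed-size two-dimensional grid, so 3 vs. 2 and 4 vs. 3 states occupy the same input space.

import numpy as np

GRID = 20      # BEV resolution (assumed; not specified here)
FIELD = 25.0   # field edge length in meters (typical 3 vs. 2 Keepaway size)

def to_cell(pos):
    """Map a continuous field coordinate (x, y) to a BEV grid cell."""
    x, y = pos
    col = min(int(x / FIELD * GRID), GRID - 1)
    row = min(int(y / FIELD * GRID), GRID - 1)
    return row, col

def bev_state(keepers, takers, ball):
    """Rasterize one game state onto the shared two-dimensional map.

    The output shape is independent of how many agents play, which is
    what lets one static representation cover 3 vs. 2 and 4 vs. 3.
    """
    grid = np.zeros((GRID, GRID))
    for p in keepers:
        grid[to_cell(p)] = 1.0    # teammates
    for p in takers:
        grid[to_cell(p)] = -1.0   # opponents
    grid[to_cell(ball)] = 0.5     # ball (overwrites a co-located agent;
                                  # a sketch-level simplification)
    return grid

# Both task variants produce inputs of identical shape:
s3v2 = bev_state([(2, 2), (20, 3), (12, 22)], [(10, 10), (13, 12)], (3, 3))
s4v3 = bev_state([(2, 2), (20, 3), (12, 22), (22, 22)],
                 [(10, 10), (13, 12), (11, 14)], (3, 3))
assert s3v2.shape == s4v3.shape == (GRID, GRID)

An indirectly encoded learner such as HyperNEAT can then generate a policy as a function of this grid's geometry rather than of a fixed agent count, which is why the same solution can still apply when agents are added or the map is resized.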
