Evolving multimodal behavior through modular multiobjective neuroevolution

Intelligent organisms do not simply perform one task, but exhibit multiple distinct modes of behavior. For instance, humans can swim, climb, write, solve problems, and play sports. To be fully autonomous and robust, artificial agents in both physical and virtual worlds would benefit from a similar diversity of behaviors. Artificial evolution, in particular neuroevolution [3, 4], is known to be capable of discovering complex agent behavior. This dissertation expands on existing neuroevolution methods, specifically NEAT (NeuroEvolution of Augmenting Topologies [7]), to make the discovery of multiple modes of behavior possible. More specifically, it proposes four extensions: (1) multiobjective evolution, (2) sensors that are split according to context, (3) modular neural network structures, and (4) fitness-based shaping. All of these technical contributions are incorporated into the software framework Modular Multiobjective NEAT (MM-NEAT), which is publicly available for download.
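To make two of these extensions concrete, the sketch below pairs a Pareto-dominance test (the core comparison behind multiobjective selection methods such as NSGA-II [101]) with a toy modular policy in which each output module carries a dedicated preference neuron, and the module whose preference neuron fires highest controls the agent on each timestep, in the spirit of [8, 26]. This is a minimal Python sketch under assumed names (`ModularPolicy`, `pareto_dominates` are hypothetical); MM-NEAT itself evolves NEAT network topologies in Java rather than the fixed linear modules used here.

```python
# Illustrative sketch only: names and structure are hypothetical,
# not MM-NEAT's actual API.
import numpy as np

def pareto_dominates(a, b):
    """True if objective vector a dominates b (maximization):
    no worse in every objective, strictly better in at least one."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a >= b) and np.any(a > b))

class ModularPolicy:
    """Linear stand-in for an evolved network with several output modules.

    Each module owns one block of action outputs plus one extra
    "preference neuron"; on every timestep the module whose preference
    neuron fires highest selects the action, letting distinct modules
    specialize in distinct modes of behavior.
    """
    def __init__(self, num_inputs, num_actions, num_modules, rng=None):
        rng = rng or np.random.default_rng()
        # One weight matrix per module: inputs -> (actions + 1 preference).
        self.modules = [rng.standard_normal((num_inputs, num_actions + 1))
                        for _ in range(num_modules)]

    def act(self, sensors):
        outputs = [np.tanh(sensors @ w) for w in self.modules]
        chosen = int(np.argmax([o[-1] for o in outputs]))  # arbitrate by preference
        action = int(np.argmax(outputs[chosen][:-1]))      # act with the winner
        return action, chosen

# Example: a two-module policy over 5 sensors and 4 actions.
policy = ModularPolicy(num_inputs=5, num_actions=4, num_modules=2)
action, module = policy.act(np.random.default_rng(0).random(5))

# Dominance check as used when sorting a population by multiple objectives.
assert pareto_dominates([2.0, 3.0], [2.0, 1.0])
```

In an evolutionary loop, a test like `pareto_dominates` would rank candidate `ModularPolicy` genotypes on several objectives at once (e.g., pills eaten and ghosts caught in Ms. Pac-Man) instead of collapsing them into a single fitness score.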

[1] R. Thawonmas et al. Automatic Controller of Ms. Pac-Man and Its Performance: Winner of the IEEE CEC 2009 Software Agent Ms. Pac-Man Competition, 2009.

[2] Stéphane Doncieux et al. Using behavioral exploration objectives to solve deceptive problems in neuro-evolution, 2009, GECCO.

[3] Charles Ofria et al. Investigating whether hyperNEAT produces modular neural networks, 2010, GECCO '10.

[4] Moises Martinez et al. Pac-mAnt: Optimization based on ant colonies applied to developing an agent for Ms. Pac-Man, 2010, Proceedings of the 2010 IEEE Conference on Computational Intelligence and Games.

[5] H. Handa et al. Evolutionary fuzzy systems for generating better Ms.PacMan players, 2008, 2008 IEEE International Conference on Fuzzy Systems (IEEE World Congress on Computational Intelligence).

[6] Simon M. Lucas et al. Using genetic programming to evolve heuristics for a Monte Carlo Tree Search Ms Pac-Man agent, 2013, 2013 IEEE Conference on Computational Intelligence in Games (CIG).

[7] Sergey Levine et al. Feature Construction for Inverse Reinforcement Learning, 2010, NIPS.

[8] Risto Miikkulainen et al. Evolving Multimodal Networks for Multitask Games, 2012, IEEE Transactions on Computational Intelligence and AI in Games.

[9] U.S. Department of Transportation. National Motor Vehicle Crash Causation Survey, 2008.

[10] Risto Miikkulainen et al. Evolving Stochastic Controller Networks for Intelligent Game Agents, 2006, 2006 IEEE International Conference on Evolutionary Computation.

[11] Risto Miikkulainen et al. Multiagent Learning through Neuroevolution, 2012, WCCI.

[12] Kenneth O. Stanley et al. Constraining connectivity to encourage modularity in HyperNEAT, 2011, GECCO '11.

[13] Stefano Nolfi et al. Duplication of Modules Facilitates the Evolution of Functional Specialization, 1999, Artificial Life.

[14] András Lörincz et al. Learning to Play Using Low-Complexity Rule-Based Policies: Illustrations through Ms. Pac-Man, 2007, J. Artif. Intell. Res.

[15] Jeff Clune et al. A novel generative encoding for evolving modular, regular and scalable networks, 2011, GECCO '11.

[16] Christian Lebiere et al. The Cascade-Correlation Learning Architecture, 1989, NIPS.

[17] Samad Ahmadi et al. Reactive control of Ms. Pac Man using information retrieval based on Genetic Programming, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).

[18] Roderic A. Grupen et al. A feedback control structure for on-line learning tasks, 1997, Robotics Auton. Syst.

[19] Peter J. Fleming et al. An Overview of Evolutionary Algorithms in Multiobjective Optimization, 1995, Evolutionary Computation.

[20] Takeshi Ito et al. Monte-Carlo tree search in Ms. Pac-Man, 2011, 2011 IEEE Conference on Computational Intelligence and Games (CIG'11).

[21] Marco Wiering et al. Reinforcement learning to train Ms. Pac-Man using higher-order action-relative inputs, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).

[22] Jan Koutník et al. NEAT in HyperNEAT Substituted with Genetic Programming, 2009, ICANNGA.

[23] Simon M. Lucas et al. Fast Approximate Max-n Monte Carlo Tree Search for Ms Pac-Man, 2011, IEEE Transactions on Computational Intelligence and AI in Games.

[24] Simon Haykin et al. Neural Networks: A Comprehensive Foundation, 1998.

[25] Kenneth O. Stanley et al. A novel generative encoding for exploiting neural network sensor and output geometry, 2007, GECCO '07.

[26] Risto Miikkulainen et al. Evolving multimodal behavior with modular neural networks in Ms. Pac-Man, 2014, GECCO.

[27] Kee-Eung Kim et al. Bayesian Nonparametric Feature Construction for Inverse Reinforcement Learning, 2013, IJCAI.

[28] Risto Miikkulainen et al. Constructing complex NPC behavior via multi-objective neuroevolution, 2008, AAAI 2008.

[29] Manuela M. Veloso et al. Multiagent Systems: A Survey from a Machine Learning Perspective, 2000, Auton. Robots.

[30] J. Nazuno. Haykin, Simon. Neural Networks: A Comprehensive Foundation, Prentice Hall, Inc., Second Edition, 1999, 2000.

[31] Martin J. Oates et al. PESA-II: region-based selection in evolutionary multiobjective optimization, 2001.

[32] Frédéric Gruau et al. Automatic Definition of Modular Neural Networks, 1994, Adapt. Behav.

[33] Simon M. Lucas et al. Using a training camp with Genetic Programming to evolve Ms Pac-Man agents, 2011, 2011 IEEE Conference on Computational Intelligence and Games (CIG'11).

[34] Risto Miikkulainen et al. Efficient evolution of neural networks through complexification, 2004.

[35] Chi Wan Sung et al. A Monte-Carlo approach for the endgame of Ms. Pac-Man, 2011, 2011 IEEE Conference on Computational Intelligence and Games (CIG'11).

[36] John R. Koza et al. Genetic programming 2 - automatic discovery of reusable programs, 1994, Complex Adaptive Systems.

[37] Sanjiv Singh et al. The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, George Air Force Base, Victorville, California, USA, 2009, The DARPA Urban Challenge.

[38] Risto Miikkulainen et al. Open-ended behavioral complexity for evolved virtual creatures, 2013, GECCO '13.

[39] Kenneth O. Stanley et al. Generative encoding for multiagent learning, 2008, GECCO '08.

[40] Bernhard Hengst et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.

[41] Shie Mannor et al. A Tutorial on the Cross-Entropy Method, 2005, Ann. Oper. Res.

[42] Doina Precup et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.

[43] András Lörincz et al. Learning Tetris Using the Noisy Cross-Entropy Method, 2006, Neural Computation.

[44] Silvia Ferrari et al. A model-based cell decomposition approach to on-line pursuit-evasion path planning and the video game Ms. Pac-Man, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).

[45] Dana H. Ballard et al. Genetic Programming with Adaptive Representations, 1994.

[46] John R. Koza et al. Genetic programming - on the programming of computers by means of natural selection, 1993, Complex adaptive systems.

[47] C. A. Coello Coello et al. A Comprehensive Survey of Evolutionary-Based Multiobjective Optimization Techniques, 1999, Knowledge and Information Systems.

[48] Hod Lipson et al. The evolutionary origins of modularity, 2012, Proceedings of the Royal Society B: Biological Sciences.

[49] J. Chai et al. Editorial, 1999, Mechanisms of Ageing and Development.

[50] P. Hingston. Believable Bots: Can Computers Play Like People?, 2012.

[51] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.

[52] Peter Stone et al. Layered learning in multiagent systems - a winning approach to robotic soccer, 2000, Intelligent robotics and autonomous agents.

[53] Bruno Sareni et al. Fitness sharing and niching methods revisited, 1998, IEEE Trans. Evol. Comput.

[54] Simon M. Lucas et al. Evolving diverse Ms. Pac-Man playing agents using genetic programming, 2010, 2010 UK Workshop on Computational Intelligence (UKCI).

[55] Kenneth O. Stanley et al. A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks, 2009, Artificial Life.

[56] Jing Shen et al. Multi-robot Cooperation Based on Hierarchical Reinforcement Learning, 2007, International Conference on Computational Science.

[57] Xin Yao et al. Co-evolutionary modular neural networks for automatic problem decomposition, 2005, 2005 IEEE Congress on Evolutionary Computation.

[58] Ruck Thawonmas et al. Evolution strategy for optimizing parameters in Ms Pac-Man controller ICE Pambush 3, 2010, Proceedings of the 2010 IEEE Conference on Computational Intelligence and Games.

[59] Risto Miikkulainen et al. Automatic feature selection in neuroevolution, 2005, GECCO '05.

[60] Joel Lehman et al. Task switching in multirobot learning through indirect encoding, 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[61] Jonathan Klein et al. breve: a 3D environment for the simulation of decentralized systems and artificial life, 2002.

[62] John J. Grefenstette et al. Evolutionary Algorithms for Reinforcement Learning, 1999, J. Artif. Intell. Res.

[63] Justinian P. Rosca et al. Discovery of subroutines in genetic programming, 1996.

[64] K. Subramanian et al. Learning Options through Human Interaction, 2011.

[65] Zbigniew Michalewicz et al. Evolutionary Computation 2, 2000.

[66] Hui Li et al. Semisupervised Multitask Learning, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[67] Johan Svensson et al. Influence Map-based controllers for Ms. PacMan and the ghosts, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).

[68] Lothar Thiele et al. A Tutorial on the Performance Assessment of Stochastic Multiobjective Optimizers, 2006.

[69] John J. Grefenstette et al. Deception Considered Harmful, 1992, FOGA.

[70] Dario Floreano et al. Neuroevolution: from architectures to learning, 2008, Evol. Intell.

[71] Xin Yao et al. A new evolutionary system for evolving artificial neural networks, 1997, IEEE Trans. Neural Networks.

[72] Martin J. Oates et al. The Pareto Envelope-Based Selection Algorithm for Multi-objective Optimisation, 2000, PPSN.

[73] Simon M. Lucas et al. A simple tree search method for playing Ms. Pac-Man, 2009, 2009 IEEE Symposium on Computational Intelligence and Games.

[74] Simon M. Lucas et al. Evolving a Neural Network Location Evaluator to Play Ms. Pac-Man, 2005, CIG.

[75] Scott Kuindersma et al. Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories, 2010, NIPS.

[76] Doina Precup et al. Learning Options in Reinforcement Learning, 2002, SARA.

[77] Risto Miikkulainen et al. Evolving Neural Networks through Augmenting Topologies, 2002, Evolutionary Computation.

[78] Justinian Rosca et al. Generality versus size in genetic programming, 1996.

[79] Risto Miikkulainen et al. Evolving neural networks for strategic decision-making problems, 2009, Neural Networks.

[80] R. Bellman. Dynamic programming, 1957, Science.

[81] Robin R. Murphy et al. Disaster Robotics, 2014, Springer Handbook of Robotics, 2nd Ed.

[82] César Estébanez et al. AntBot: Ant Colonies for Video Games, 2012, IEEE Transactions on Computational Intelligence and AI in Games.

[83] Rodney A. Brooks et al. A Robust Layered Control System for a Mobile Robot, 1986.

[84] John Levine et al. Improving control through subsumption in the EvoTanks domain, 2009, 2009 IEEE Symposium on Computational Intelligence and Games.

[85] Julian Togelius et al. Hierarchical controller learning in a First-Person Shooter, 2009, 2009 IEEE Symposium on Computational Intelligence and Games.

[86] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[87] Sebastian Thrun et al. Clustering Learning Tasks and the Selective Cross-Task Transfer of Knowledge, 1998, Learning to Learn.

[88] Lothar Thiele et al. Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach, 1999, IEEE Trans. Evol. Comput.

[89] Tom Heskes et al. Task Clustering and Gating for Bayesian Multitask Learning, 2003, J. Mach. Learn. Res.

[90] Marco Laumanns et al. Performance assessment of multiobjective optimizers, 2002.

[91] Tze-Yun Leong et al. Online Feature Selection for Model-based Reinforcement Learning, 2013, ICML.

[92] Risto Miikkulainen et al. Evolving agent behavior in multiobjective domains using fitness-based shaping, 2010, GECCO '10.

[93] Marcus Gallagher et al. Evolving Pac-Man Players: Can We Learn from Raw Input?, 2007, 2007 IEEE Symposium on Computational Intelligence and Games.

[94] Julian Togelius et al. Evolution of a subsumption architecture neurocontroller, 2004, J. Intell. Fuzzy Syst.

[95] John DeNero et al. Teaching Introductory Artificial Intelligence with Pac-Man, 2010, Proceedings of the AAAI Conference on Artificial Intelligence.

[96] Manuela M. Veloso et al. Layered Learning, 2000, ECML.

[97] Dario Floreano et al. Genetic Team Composition and Level of Selection in the Evolution of Cooperation, 2009, IEEE Transactions on Evolutionary Computation.

[98] Frank W. Ciarallo et al. Multiobjectivization via Helper-Objectives With the Tunable Objectives Problem, 2012, IEEE Transactions on Evolutionary Computation.

[99] Kristen Grauman et al. Learning with Whom to Share in Multi-task Feature Learning, 2011, ICML.

[100] Jan Peters et al. Toward fast policy search for learning legged locomotion, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[101] Kalyanmoy Deb et al. A fast and elitist multiobjective genetic algorithm: NSGA-II, 2002, IEEE Trans. Evol. Comput.

[102] Rich Caruana et al. Multitask Learning: A Knowledge-Based Source of Inductive Bias, 1993, ICML.

[103] Jason Weston et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.

[104] Risto Miikkulainen et al. Coevolution of Role-Based Cooperation in Multiagent Systems, 2009, IEEE Transactions on Autonomous Mental Development.

[105] Paolo Fiorini et al. Search and Rescue Robotics, 2008, Springer Handbook of Robotics.

[106] Risto Miikkulainen et al. A Taxonomy for Artificial Embryogeny, 2003, Artificial Life.

[107] Simon M. Lucas et al. Evolution versus Temporal Difference Learning for learning to play Ms. Pac-Man, 2009, 2009 IEEE Symposium on Computational Intelligence and Games.

[108] Risto Miikkulainen et al. Efficient Non-linear Control Through Neuroevolution, 2006, ECML.

[109] Peter Stone et al. An empirical analysis of value function-based and policy search reinforcement learning, 2009, AAMAS.

[110] Risto Miikkulainen et al. Evolving adaptive neural networks with and without adaptive synapses, 2003, 2003 Congress on Evolutionary Computation (CEC '03).

[111] Andrew G. Barto et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining, 2009, NIPS.

[112] Jude W. Shavlik et al. Knowledge-Based Artificial Neural Networks, 1994, Artif. Intell.

[113] Dario Floreano et al. Evolutionary robots with on-line self-organization and behavioral fitness, 2000, Neural Networks.

[114] Andrew Y. Ng et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML '09.

[115] Lothar Thiele et al. The Hypervolume Indicator Revisited: On the Design of Pareto-compliant Indicators Via Weighted Integration, 2007, EMO.

[116] Ming Tan et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents, 1997, ICML.

[117] Risto Miikkulainen et al. The role of reward structure, coordination mechanism and net return in the evolution of cooperation, 2011, CIG.

[118] Mark H. M. Winands et al. Enhancements for Monte-Carlo Tree Search in Ms Pac-Man, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).

[119] Alexander Zelinsky et al. Q-Learning in Continuous State and Action Spaces, 1999, Australian Joint Conference on Artificial Intelligence.

[120] Marcus Gallagher et al. An influence map model for playing Ms. Pac-Man, 2008, 2008 IEEE Symposium on Computational Intelligence and Games.

[121] Donald Michie et al. BOXES: An Experiment in Adaptive Control, 2013.

[122] Rich Caruana et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.

[123] Risto Miikkulainen et al. Evolving multi-modal behavior in NPCs, 2009, CIG.

[124] Kenneth O. Stanley. Exploiting Regularity Without Development, 2006, AAAI Fall Symposium: Developmental Systems.

[125] Sridhar Mahadevan et al. Recent Advances in Hierarchical Reinforcement Learning, 2003.

[126] Daniele Loiacono et al. Evolving competitive car controllers for racing games with neuroevolution, 2009, GECCO '09.

[127] Kenneth O. Stanley et al. Abandoning Objectives: Evolution Through the Search for Novelty Alone, 2011, Evolutionary Computation.

[128] Peter Redgrave et al. Layered Control Architectures in Robots and Vertebrates, 1999, Adapt. Behav.

[129] Vijay Kumar et al. A Framework and Architecture for Multirobot Coordination, 2000, International Symposium on Experimental Robotics.

[130] Xin Yao et al. Neural-Based Learning Classifier Systems, 2008, IEEE Transactions on Knowledge and Data Engineering.

[131] Xiaoping Chen et al. RoboCup 2012: Robot Soccer World Cup XVI, 2013, Lecture Notes in Computer Science.

[132] Julian F. Miller et al. Genetic and Evolutionary Computation — GECCO 2003, 2003, Lecture Notes in Computer Science.