Solving Multiple Isolated, Interleaved, and Blended Tasks through Modular Neuroevolution

Many challenging sequential decision-making problems require agents to master multiple tasks. For instance, game agents may need to gather resources, attack opponents, and defend against attacks. Learning algorithms can thus benefit from having separate policies for these tasks, and from knowing when each one is appropriate. How well this approach works depends on how tightly coupled the tasks are. Three cases are identified: isolated tasks have distinct semantics and do not interact; interleaved tasks have distinct semantics but do interact; and blended tasks have regions where the semantics of multiple tasks overlap. Learning across multiple tasks is studied in this article with Modular Multiobjective NEAT, a neuroevolution framework applied to three variants of the challenging Ms. Pac-Man video game. In the standard, blended version of the game, a surprising and highly effective machine-discovered task division surpasses human-specified divisions, achieving the best scores to date in this game. In the isolated and interleaved versions of the game, human-specified task divisions are also successful, though the best scores are, surprisingly, still achieved by machine discovery. Modular neuroevolution is thus shown to be capable of finding useful, unexpected task divisions that surpass those apparent to a human designer.
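To make the arbitration idea concrete, the sketch below shows one way an agent can "know when each policy is appropriate": each module outputs action utilities plus a preference value, and at every timestep the module with the highest preference controls the agent (preference-neuron arbitration, as in Modular Multiobjective NEAT). This is a minimal illustration, not the paper's implementation; the hand-coded `flee_module` and `chase_module` are hypothetical stand-ins for evolved network modules.

```python
import numpy as np

class ModularPolicy:
    """Preference-neuron arbitration sketch: each module maps a state to
    (action_utilities, preference); the module with the highest preference
    selects the action for the current timestep."""

    def __init__(self, modules):
        # modules: list of callables, state -> (np.ndarray of utilities, float)
        self.modules = modules

    def act(self, state):
        outputs = [module(state) for module in self.modules]
        # Arbitrate: the module expressing the strongest preference wins.
        utilities, _ = max(outputs, key=lambda out: out[1])
        return int(np.argmax(utilities))

# Hypothetical hand-coded modules for illustration only.
# state[0] = threat level, state[1] = learned "danger" preference signal.
def flee_module(state):
    return np.array([state[0], -state[0]]), state[1]        # prefers high threat

def chase_module(state):
    return np.array([-state[0], state[0]]), 1.0 - state[1]  # prefers low threat

policy = ModularPolicy([flee_module, chase_module])
print(policy.act(np.array([0.8, 0.9])))  # high threat -> flee module decides
```

In the evolutionary setting, both the module weights and the preference outputs are shaped by selection, which is how a machine-discovered task division can emerge without a human specifying when each module should fire.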
