Forcing Neurocontrollers to Exploit Sensory Symmetry Through Hard-wired Modularity in the Game of Cellz

Several attempts have been made to construct encoding schemes that allow modularity to emerge in evolving systems, but success has been limited. We believe that in order to create successful and scalable encodings for emerging modularity, we first need to explore the benefits of different types of modularity by hard-wiring them into evolvable systems. In this paper we explore different ways of exploiting the sensory symmetry inherent in the agent in the simple game Cellz by evolving symmetrically identical modules. We conclude that significant increases in both speed of evolution and final fitness can be achieved relative to monolithic controllers. Furthermore, we show that a simple function approximation task that exhibits sensory symmetry can be used as a quick approximate measure of the utility of an encoding scheme for the more complex game-playing task.
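
The idea of symmetrically identical modules can be illustrated with a minimal sketch. It is not the architecture used in the paper, which the abstract does not specify. It assumes a hypothetical agent with a ring of evenly spaced sensors and one small network (the names `make_module`, `run_module`, and `modular_controller` are invented for this illustration) whose single set of weights is reused for every rotated view of the sensor ring. Only that one module would need to be evolved, and rotating the sensory input rotates the output in the same way.

```python
import math
import random


def make_module(n_inputs, n_hidden, rng):
    """Create one module: a tiny tanh network with a single output.

    The layer sizes are assumptions chosen for illustration only.
    """
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w_hidden, w_out


def run_module(module, inputs):
    """Evaluate the module on one view of the sensor ring."""
    w_hidden, w_out = module
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in w_hidden]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)))


def rotate(values, k):
    """Rotate a list left by k positions."""
    return values[k:] + values[:k]


def modular_controller(module, sensors):
    """Apply the same module to each rotated view of the sensors.

    Output k is the module's response with sensor k in the first slot,
    so every direction is handled by identical, shared weights.
    """
    return [run_module(module, rotate(sensors, k))
            for k in range(len(sensors))]


def genome_size(module):
    """Number of evolvable weights in one shared module."""
    w_hidden, w_out = module
    return sum(len(row) for row in w_hidden) + len(w_out)


rng = random.Random(0)
N_SENSORS, N_HIDDEN = 8, 4  # hypothetical sizes
module = make_module(N_SENSORS, N_HIDDEN, rng)
sensors = [rng.random() for _ in range(N_SENSORS)]
outputs = modular_controller(module, sensors)
```

Under these assumptions the genome holds 36 weights regardless of how many directions the controller serves, whereas a monolithic network with one independent output unit per direction would need separate weights for each.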
