Accelerated Neural Evolution through Cooperatively Coevolved Synapses

Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real-world systems, but it is often unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by interacting directly with the task environment, but so far they have not scaled well to large state spaces or to environments that are not fully observable. In recent years, neuroevolution, the artificial evolution of neural networks, has had remarkable success in tasks that exhibit these two properties. In this paper, we compare a neuroevolution method called Cooperative Synapse Neuroevolution (CoSyNE), which uses cooperative coevolution at the level of individual synaptic weights, to a broad range of reinforcement learning algorithms on very difficult versions of the pole balancing problem that involve large (continuous) state spaces and hidden state. CoSyNE is shown to be significantly more efficient and powerful than the other methods on these tasks.
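To make "cooperative coevolution at the level of individual synaptic weights" concrete, the following is a minimal sketch, not the method as specified in the paper. It keeps one subpopulation per weight, assembles candidate networks by taking one entry from each subpopulation, and reshuffles weights between networks each generation. The function name `cosyne_sketch`, the top-quarter selection, one-point crossover, Gaussian mutation, and the rule of shuffling only offspring entries are all assumptions chosen for illustration; the fitness function is a toy stand-in rather than pole balancing.

```python
import random

def cosyne_sketch(fitness, n_weights, pop_size=20, generations=50, seed=0):
    """Toy cooperative coevolution over individual weights (illustrative only)."""
    rng = random.Random(seed)
    # One subpopulation per synaptic weight: subpops[i][j] is candidate j for weight i.
    subpops = [[rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
               for _ in range(n_weights)]
    best_score, best_weights = float("-inf"), None
    for _ in range(generations):
        # Assemble network j from the j-th entry of every subpopulation and score it.
        scored = []
        for j in range(pop_size):
            weights = [subpops[i][j] for i in range(n_weights)]
            scored.append((fitness(weights), weights))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_weights = scored[0][0], list(scored[0][1])
        # Assumption: the top quarter survives and breeds the remaining networks.
        n_parents = max(2, pop_size // 4)
        parents = [w for _, w in scored[:n_parents]]
        rows = [list(p) for p in parents]
        while len(rows) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights) if n_weights > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover
            k = rng.randrange(n_weights)
            child[k] += rng.gauss(0.0, 0.3)    # mutate a single weight
            rows.append(child)
        # Write weights back per subpopulation. Offspring entries are shuffled
        # within each subpopulation so they join different networks next time;
        # parent entries stay aligned (a simplification of fitness-based permutation).
        for i in range(n_weights):
            tail = [row[i] for row in rows[n_parents:]]
            rng.shuffle(tail)
            subpops[i] = [row[i] for row in rows[:n_parents]] + tail
    return best_score, best_weights
```

As a usage example, calling `cosyne_sketch(lambda w: -sum(x * x for x in w), n_weights=3)` searches for a three-weight vector close to zero and returns the best score seen together with the corresponding weights. In a control setting, the fitness function would instead build a network from the weight vector and return its performance on the task.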
