Efficient Reinforcement Learning through Symbiotic Evolution

This article presents a new reinforcement learning method called SANE (Symbiotic, Adaptive Neuro-Evolution), which evolves a population of neurons through genetic algorithms to form a neural network capable of performing a task. Symbiotic evolution promotes both cooperation and specialization, which results in a fast, efficient genetic search and discourages convergence to suboptimal solutions. In the inverted pendulum problem, SANE formed effective networks 9 to 16 times faster than the Adaptive Heuristic Critic and 2 times faster than Q-learning and the GENITOR neuro-evolution approach, without loss of generalization. Such efficient learning, combined with few domain assumptions, makes SANE a promising approach to a broad range of reinforcement learning problems, including many real-world applications.
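
The idea of evolving individual neurons rather than whole networks can be illustrated with a minimal sketch. It is not the SANE implementation: the abstract does not give the algorithm's details, so everything specific here is an assumption made for illustration. That includes assembling each network from a random subset of the population, scoring a neuron by the mean fitness of the networks it took part in, top-quarter selection with one-point crossover and Gaussian mutation, all constants, and a toy XOR task used in place of the inverted pendulum. All function and constant names are hypothetical.

```python
import math
import random

# Illustrative sizes only; none of these values come from the article.
N_INPUTS, N_OUTPUTS = 2, 1
POP_SIZE, NET_SIZE = 40, 4      # neurons in the population / hidden neurons per network
TRIALS_PER_GEN = 200            # networks assembled and scored per generation

def random_neuron(rng):
    # A neuron is one hidden unit: input weights, a bias, and output weights.
    return [rng.uniform(-1, 1) for _ in range(N_INPUTS + 1 + N_OUTPUTS)]

def activate(neurons, x):
    # Treat the chosen neurons as the hidden layer of a small network.
    out = 0.0
    for w in neurons:
        pre = sum(wi * xi for wi, xi in zip(w[:N_INPUTS], x)) + w[N_INPUTS]
        out += w[N_INPUTS + 1] * math.tanh(pre)
    return out

def network_fitness(neurons):
    # Toy stand-in task (XOR), assumed for illustration: negative squared error.
    cases = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    return -sum((activate(neurons, x) - y) ** 2 for x, y in cases)

def evaluate(population, rng):
    # A neuron's fitness is the mean fitness of the networks it joined,
    # so neurons are rewarded for cooperating well with others.
    totals = [0.0] * len(population)
    counts = [0] * len(population)
    best = -math.inf
    for _ in range(TRIALS_PER_GEN):
        idx = rng.sample(range(len(population)), NET_SIZE)
        f = network_fitness([population[i] for i in idx])
        best = max(best, f)
        for i in idx:
            totals[i] += f
            counts[i] += 1
    fitness = [t / c if c else -math.inf for t, c in zip(totals, counts)]
    return fitness, best

def next_generation(population, fitness, rng, mutation_rate=0.1):
    # Keep the top quarter; refill with one-point crossover plus mutation.
    pairs = sorted(zip(fitness, population), key=lambda p: p[0], reverse=True)
    ranked = [neuron for _, neuron in pairs]
    elite = ranked[: len(ranked) // 4]
    children = []
    while len(elite) + len(children) < len(population):
        a, b = rng.sample(elite, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        child = [w + rng.gauss(0, 0.3) if rng.random() < mutation_rate else w
                 for w in child]
        children.append(child)
    return elite + children

def evolve(generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_neuron(rng) for _ in range(POP_SIZE)]
    history = []  # best network fitness seen in each generation
    for _ in range(generations):
        fitness, best = evaluate(population, rng)
        history.append(best)
        population = next_generation(population, fitness, rng)
    return population, history
```

Calling `evolve()` returns the final neuron population and the best network fitness observed in each generation. Because no single neuron solves the task alone, selection in this sketch acts on how well a neuron combines with others, which is the cooperation-and-specialization pressure the abstract describes.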
