Paper information - The Evolution of Learning: Balancing adaptivity and stability in artificial agents

A longstanding challenge in artificial intelligence is to create agents that learn, enabling them to interact with and adapt to a complex and changing world. A better understanding of the evolution of learning may help produce robust and adaptive agents, as well as shed light on open questions from biology about the evolution of learning. Evolutionary computation offers the benefits of precise experimental control, repeatability of experiments and rapid generational turnover, enabling experiments that test hypotheses which would be impossible or extremely time-consuming to test in natural studies. The evolution of learning is influenced by the balance between the benefits offered by adaptivity and the costs (disadvantages) individuals pay for learning abilities. Such costs include forgetting previous knowledge, the dangers of exploration and the maintenance of neural structures for learning. This thesis focuses on how evolution regulates learning capacities to reap the benefits of being adaptive while minimizing the costs of learning. The regulation of learning capacities is studied along three main axes: regulation through individual lifetimes, regulation within a population facing varying environments and regulation across neural modules. The study of learning regulation within individual lifetimes is inspired by the sensitive periods in learning observed in nature: limited periods within individuals’ lives where learning is temporarily facilitated. Experiments herein demonstrate that sensitive periods can emerge to schedule learning in tasks where there are dependencies between the learning of sub-tasks, and further explore how the flexibility of evolved sensitive periods depends on assumptions about which factors regulate plasticity. On the population level, the evolution of learning efforts is known to be highly dependent on the variability of the environment and the reliability of environmental stimuli.
Evolving the innate preferences and learning rates of individuals across a wide range of environmental variability demonstrates that environments changing too rapidly or too slowly discourage the evolution of learning. Further experiments show how independently varying the degrees of environmental stability and stimulus reliability leads to a refinement of this model of learning, which also acknowledges that learning may be disruptive or inefficient when stimuli are not reliable. One cost of learning is the risk of losing old information as new information is gained, a problem known as catastrophic forgetting. By evolving individuals that face a task with potential for catastrophic forgetting, it is demonstrated that adding an evolutionary cost on neural connections leads to more modular networks, which forget less of their old skills when learning a new one. Together, the findings herein demonstrate several ways to handle the so-called stability-plasticity dilemma: how can an individual be realized that has the flexibility to adapt without risking unstable behaviors and the forgetting of old skills? The findings suggest ways in which evolution may have solved this problem in natural learners, and ways to harness the powers of evolution to mitigate this problem in artificial agents.

Kai Olav Ellefsen
