Machine Learned Learning Machines

There are two common approaches for optimizing the performance of a machine: genetic algorithms and machine learning. A genetic algorithm is applied over many generations, whereas machine learning works by applying feedback until the system meets a performance threshold. Although these methods typically operate separately, we combine evolutionary adaptation and machine learning into one approach. Our focus is on machines that can learn during their lifetime, but instead of equipping them with a machine learning algorithm, we aim to let them evolve the ability to learn by themselves. We use evolvable networks of probabilistic and deterministic logic gates, known as Markov Brains, as our computational model organism. The ability of Markov Brains to learn is augmented by a novel adaptive component, the feedback gate, that can change its computational behavior based on feedback. We show that Markov Brains can indeed evolve to incorporate these feedback gates to improve their adaptability to variable environments. By combining these two methods, we have also implemented a computational model that can be used to study the evolution of learning.
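
As a rough illustration of the adaptive component described above, the following Python sketch models a probabilistic logic gate whose probability table is adjusted by feedback. It is a toy under stated assumptions, not the implementation used in the paper: the class name FeedbackGate, the multiplicative update rule, the rate parameter, and the copy task in train_copy_gate are all hypothetical choices made only for this example.

```python
import random


class FeedbackGate:
    """Toy probabilistic logic gate with a feedback-adjusted table (illustrative only)."""

    def __init__(self, n_inputs, n_outputs, rng=None):
        # Assumes n_inputs >= 1 and n_outputs >= 1.
        self.rng = rng or random.Random()
        rows, cols = 2 ** n_inputs, 2 ** n_outputs
        # Uniform start: every input pattern maps to each output pattern equally often.
        self.table = [[1.0 / cols] * cols for _ in range(rows)]
        self.last = None  # (row, column) sampled in the most recent update

    def update(self, inputs):
        """Sample an output pattern (as an integer index) for the given input bits."""
        row = int("".join(str(int(b)) for b in inputs), 2)
        cols = range(len(self.table[row]))
        col = self.rng.choices(cols, weights=self.table[row])[0]
        self.last = (row, col)
        return col

    def feedback(self, positive, rate=0.5):
        """Assumed rule: scale the last used entry up or down, then renormalize its row."""
        if self.last is None:
            return
        row, col = self.last
        self.table[row][col] *= (1.0 + rate) if positive else (1.0 - rate)
        total = sum(self.table[row])
        self.table[row] = [p / total for p in self.table[row]]


def train_copy_gate(steps=200, seed=0):
    """Hypothetical task: reward a 1-input, 1-output gate for copying its input bit."""
    rng = random.Random(seed)
    gate = FeedbackGate(1, 1, rng)
    for _ in range(steps):
        bit = rng.randint(0, 1)
        out = gate.update([bit])
        gate.feedback(positive=(out == bit))
    return gate
```

In this sketch, both positive and negative feedback shift probability mass toward the rewarded mapping, so the gate's table drifts from uniform toward a deterministic copy function over its lifetime. In the approach the abstract describes, whether and how such gates are wired into a Markov Brain would be left to evolution, which this example does not model.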
