Embodied intelligence via learning and evolution

The intertwined processes of learning and evolution in complex environmental niches have resulted in a remarkable diversity of morphological forms. Moreover, many aspects of animal intelligence are deeply embodied in these evolved morphologies. However, the principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control remain elusive, because performing large-scale in silico experiments on evolution and learning is challenging. Here, we introduce Deep Evolutionary Reinforcement Learning (DERL): a computational framework that can evolve diverse agent morphologies to learn challenging locomotion and manipulation tasks in complex environments. Leveraging DERL, we demonstrate several relations between environmental complexity, morphological intelligence, and the learnability of control. First, environmental complexity fosters the evolution of morphological intelligence, as quantified by the ability of a morphology to facilitate the learning of novel tasks. Second, we demonstrate a morphological Baldwin effect: in our simulations, evolution rapidly selects morphologies that learn faster, thereby enabling behaviors learned late in the lifetime of early ancestors to be expressed early in the descendants' lifetime. Third, we suggest a mechanistic basis for the above relationships through the evolution of morphologies that are more physically stable and energy efficient, and can therefore facilitate learning and control.
