Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity

Contemporary sensorimotor learning approaches typically start with an existing complex agent (e.g., a robotic arm), which they learn to control. In contrast, this paper investigates a modular co-evolution strategy: a collection of primitive agents learns to dynamically self-assemble into composite bodies while also learning to coordinate their behavior to control these bodies. Each primitive agent consists of a limb with a motor attached at one end. Limbs may choose to link up to form collectives. When a limb initiates a link-up action and another limb is nearby, the latter is magnetically connected to the 'parent' limb's motor. This forms a new single agent, which may further link with other agents. In this way, complex morphologies can emerge, controlled by a policy whose architecture is in explicit correspondence with the morphology. We evaluate the performance of these dynamic and modular agents in simulated environments. We demonstrate better generalization to test-time changes both in the environment and in the structure of the agent, compared to static and monolithic baselines. Project video and code are available at this https URL.
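The key idea above — a policy whose architecture mirrors the assembled morphology — can be illustrated with a minimal sketch (not the authors' code; all names and the tiny one-layer network are hypothetical). Each limb runs the *same* shared parameters, and messages flow from child limbs up to their parent along the current link-up tree, so the composed policy naturally changes shape as limbs attach or detach:

```python
import math
import random

def shared_net(params, x):
    """A tiny one-layer shared network: tanh(W x + b)."""
    W, b = params
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

class Limb:
    """A primitive agent: one limb with a motor at one end."""
    def __init__(self, obs_dim, msg_dim):
        self.children = []          # limbs magnetically attached to this motor
        self.obs = [0.0] * obs_dim  # this limb's local sensor readings
        self.msg_dim = msg_dim

    def link(self, child):
        """Dynamic self-assembly: attach another limb to this limb's motor."""
        self.children.append(child)

def policy_forward(limb, params):
    """Bottom-up message passing over the morphology tree.

    Every limb applies the same shared parameters, which is what lets
    the policy transfer to morphologies not seen during training.
    Returns (torque, message_to_parent).
    """
    # Sum messages coming up from attached children (zeros for a leaf limb).
    agg = [0.0] * limb.msg_dim
    for child in limb.children:
        _, msg = policy_forward(child, params)
        agg = [a + m for a, m in zip(agg, msg)]
    out = shared_net(params, limb.obs + agg)
    return out[0], out[1:]  # motor torque, message passed to the parent

# Example: three limbs self-assemble into a chain, then act as one agent.
random.seed(0)
obs_dim, msg_dim = 2, 2
W = [[random.uniform(-1, 1) for _ in range(obs_dim + msg_dim)]
     for _ in range(1 + msg_dim)]
params = (W, [0.0] * (1 + msg_dim))

root, mid, tip = (Limb(obs_dim, msg_dim) for _ in range(3))
root.link(mid)
mid.link(tip)
torque, message = policy_forward(root, params)
```

Because `policy_forward` recurses over whatever tree currently exists, adding or removing a limb changes the effective network topology without changing any weights — a (simplified) analogue of the modularity the paper argues drives generalization.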
