Residual Model Learning for Microrobot Control

Most microrobots are built from compliant materials that are difficult to model analytically, which limits the utility of traditional model-based controllers. Data collection on microrobots is challenging, and the large errors between simulated models and real robots make current model-based learning and sim-to-real transfer methods difficult to apply. We propose a novel framework, residual model learning (RML), that leverages approximate models to substantially reduce the sample complexity of learning an accurate robot model. We show that, using RML, we can learn a model of the Harvard Ambulatory MicroRobot (HAMR) from just 12 seconds of passively collected interaction data. The learned model is accurate enough to serve as a "proxy-simulator" for learning walking and turning behaviors with model-free reinforcement learning algorithms. RML provides a general framework for learning from extremely small amounts of interaction data, and our experiments with HAMR demonstrate that RML substantially outperforms existing techniques.
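The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea of learning a residual on top of an approximate model, not the authors' implementation. It assumes a toy one-dimensional system; `approx_model`, `true_dynamics`, `features`, and `rml_model` are hypothetical names introduced for illustration, and the residual is fit with ordinary least squares rather than whatever learner the paper uses.

```python
import numpy as np

def approx_model(s, a):
    # Hypothetical approximate dynamics: an analytic model that omits some effects.
    return s + 0.1 * a

def true_dynamics(s, a):
    # Hypothetical "real robot" used only to generate toy data.
    return 0.9 * s + 0.1 * a + 0.05 * np.sin(s)

def features(s, a):
    # Hand-picked features for the residual; an assumption of this sketch.
    return np.stack([s, a, np.sin(s), np.ones_like(s)], axis=1)

rng = np.random.default_rng(0)

# A small batch of passively collected transitions (s, a, s_next).
s = rng.uniform(-1.0, 1.0, 200)
a = rng.uniform(-1.0, 1.0, 200)
s_next = true_dynamics(s, a)

# Fit only the part the approximate model gets wrong.
residual_target = s_next - approx_model(s, a)
w, *_ = np.linalg.lstsq(features(s, a), residual_target, rcond=None)

def rml_model(s, a):
    # Corrected prediction: approximate model plus learned residual.
    return approx_model(s, a) + features(s, a) @ w

# Compare one-step prediction error on held-out transitions.
s_test = rng.uniform(-1.0, 1.0, 50)
a_test = rng.uniform(-1.0, 1.0, 50)
target = true_dynamics(s_test, a_test)
err_approx = np.mean(np.abs(approx_model(s_test, a_test) - target))
err_rml = np.mean(np.abs(rml_model(s_test, a_test) - target))
print(err_rml < err_approx)
```

Because the learner only has to capture the gap between the approximate model and the data, rather than the full dynamics, it can in principle get by with fewer samples; a corrected model of this kind could then stand in for a simulator when training a policy.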
