SAGCI-System: Towards Sample-Efficient, Generalizable, Compositional, and Incremental Robot Learning

Building general-purpose robots that perform an enormous number of tasks in a large variety of environments at the human level is notoriously complicated. According to [1], this requires robot learning to be sample-efficient, generalizable, compositional, and incremental. In this work, we introduce a systematic learning framework called SAGCI-system toward achieving these four requirements. Our system first takes raw point clouds gathered by the camera mounted on the robot’s wrist as input and produces an initial model of the surrounding environment, represented as a URDF. The system then loads this URDF into a learning-augmented differentiable simulation. The robot uses interactive perception to interact with the environment and to verify and modify the URDF online. Leveraging the simulation, we propose a new model-based RL algorithm that combines object-centric and robot-centric approaches to efficiently produce policies for manipulation tasks. We apply our system to articulated object manipulation, both in simulation and in the real world. Extensive experiments demonstrate the effectiveness of our proposed learning framework. Supplemental materials and videos are available on our project webpage: https://sites.google.com/view/egci
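To make the idea of verifying and modifying a model through a differentiable simulation more concrete, the following is a minimal toy sketch, not the SAGCI implementation. It assumes a single prismatic joint (for example a drawer) pulled by a constant force, with one unknown damping parameter. The functions `simulate` and `fit_damping`, the dynamics, and all numeric values are hypothetical choices made for illustration. The simulator propagates the derivative of the position with respect to the damping alongside the state, and gradient descent uses it to match an observed trajectory.

```python
# Toy illustration only: NOT the SAGCI implementation.
# Assumed setup: a 1-DoF prismatic joint with unknown damping,
# pulled by a constant force and integrated with a simple Euler scheme.


def simulate(damping, steps=40, dt=0.05, mass=1.0, force=1.0):
    """Roll out the joint and return positions plus d(position)/d(damping)."""
    x, v = 0.0, 0.0    # joint position and velocity
    dx, dv = 0.0, 0.0  # sensitivities of x and v with respect to damping
    xs, dxs = [], []
    for _ in range(steps):
        # Differentiate the velocity update first, while v still holds the old value.
        dv = dv + dt * (-v - damping * dv) / mass
        v = v + dt * (force - damping * v) / mass
        # The position update uses the new velocity, and so does its sensitivity.
        dx = dx + dt * dv
        x = x + dt * v
        xs.append(x)
        dxs.append(dx)
    return xs, dxs


def fit_damping(observed, init=0.5, lr=1.0, iters=1000):
    """Gradient descent on the mean squared position error."""
    damping = init
    n = len(observed)
    for _ in range(iters):
        xs, dxs = simulate(damping, steps=n)
        # Chain rule: dL/d(damping) = mean(2 * (x - x_obs) * dx/d(damping)).
        grad = sum(2.0 * (x - xo) * dx for x, xo, dx in zip(xs, observed, dxs)) / n
        damping -= lr * grad
    return damping


if __name__ == "__main__":
    # Stand-in for an interaction: a trajectory generated with damping 2.0.
    observed, _ = simulate(2.0)
    print(fit_damping(observed))
```

In this sketch the observed trajectory stands in for what the robot would measure during an interaction, and the damping stands in for a model parameter that the initial estimate got wrong. A full system would differentiate through a far richer simulator and over many more parameters.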

[1]  Joshua B. Tenenbaum,et al.  End-to-End Differentiable Physics for Learning and Control , 2018, NeurIPS.

[2]  Oliver Brock,et al.  Entropy-based strategies for physical exploration of the environment's degrees of freedom , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[3]  Leonidas J. Guibas,et al.  FlowNet3D: Learning Scene Flow in 3D Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Sergey Levine,et al.  Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.

[5]  Yuval Tassa,et al.  Synthesis and stabilization of complex behaviors through online trajectory optimization , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[6]  Connor Schenck,et al.  SPNets: Differentiable Fluid Dynamics for Deep Neural Networks , 2018, CoRL.

[7]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[8]  Pieter Abbeel,et al.  Benchmarking Model-Based Reinforcement Learning , 2019, ArXiv.

[9]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[10]  Jonas Degrave,et al.  A Differentiable Physics Engine for Deep Learning in Robotics , 2016, Front. Neurorobot.

[11]  Leonidas J. Guibas,et al.  SAPIEN: A SimulAted Part-Based Interactive ENvironment , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Leslie Pack Kaelbling,et al.  The foundation of efficient robot learning , 2020, Science.

[13]  Yichao Zhou,et al.  ManifoldPlus: A Robust and Scalable Watertight Manifold Surface Generation Method for Triangle Soups , 2020, ArXiv.

[14]  Leslie Pack Kaelbling,et al.  Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[15]  Frédo Durand,et al.  Taichi , 2019, ACM Trans. Graph.

[16]  C. Karen Liu,et al.  Fast and Feature-Complete Differentiable Physics for Articulated Rigid Bodies with Contact , 2021, ArXiv.

[17]  Gaurav S. Sukhatme,et al.  Active articulation model estimation through interactive perception , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[18]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[19]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[20]  Roland Siegwart,et al.  Object Finding in Cluttered Scenes Using Interactive Perception , 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[21]  Advait Jain,et al.  Pulling open doors and drawers: Coordinating an omni-directional base and a compliant arm with Equilibrium Point control , 2010, 2010 IEEE International Conference on Robotics and Automation.

[22]  Oliver Brock,et al.  Interactive Perception: Leveraging Action in Perception and Perception in Action , 2016, IEEE Transactions on Robotics.

[23]  Vladlen Koltun,et al.  Learning to Control PDEs with Differentiable Physics , 2020, ICLR.

[24]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[25]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[26]  Ming C. Lin,et al.  Differentiable Cloth Simulation for Inverse Problems , 2019, NeurIPS.

[27]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[28]  Joshua B. Tenenbaum,et al.  A Compositional Object-Based Approach to Learning Physical Dynamics , 2016, ICLR.

[29]  Marcin Andrychowicz,et al.  Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.

[30]  Mark Chen,et al.  Language Models are Few-Shot Learners , 2020, NeurIPS.

[31]  Catholijn M. Jonker,et al.  Model-based Reinforcement Learning: A Survey , 2020, ArXiv.

[32]  Bernhard Thomaszewski,et al.  ADD , 2020, ACM Trans. Graph.

[33]  Daniel L. K. Yamins,et al.  Flexible Neural Representation for Physics Prediction , 2018, NeurIPS.

[34]  Gaurav S. Sukhatme,et al.  NeuralSim: Augmenting Differentiable Simulators with Neural Networks , 2021, 2021 IEEE International Conference on Robotics and Automation (ICRA).

[35]  Marc Toussaint,et al.  Differentiable Physics and Stable Modes for Tool-Use and Manipulation Planning - Extended Abstract , 2019, IJCAI.

[36]  John Salvatier,et al.  Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[37]  Jiancheng Liu,et al.  ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[38]  Herke van Hoof,et al.  Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.

[39]  Carl E. Rasmussen,et al.  PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.

[40]  Oliver Brock,et al.  Online interactive perception of articulated objects with multi-level recursive estimation based on task-specific priors , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[41]  Sergey Levine,et al.  Guided Policy Search , 2013, ICML.

[42]  Ming C. Lin,et al.  Scalable Differentiable Physics for Learning and Control , 2020, ICML.