Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning

Prediction is an appealing objective for self-supervised learning of behavioral skills, particularly for autonomous robots. However, effectively utilizing predictive models for control, especially with raw image inputs, poses a number of major challenges. How should the predictions be used? What happens when they are inaccurate? In this paper, we tackle these questions by proposing a method for learning robotic skills from raw image observations, using only autonomously collected experience. We show that even an imperfect model can complete complex tasks if it can continuously retry, but this requires the model to not lose track of the objective (e.g., the object of interest). To enable a robot to continuously retry a task, we devise a self-supervised algorithm for learning image registration, which can keep track of objects of interest for the duration of the trial. We demonstrate that this idea can be combined with a video-prediction based controller to enable complex behaviors to be learned from scratch using only raw visual inputs, including grasping, repositioning objects, and non-prehensile manipulation. Our real-world experiments demonstrate that a model trained with 160 robot hours of autonomously collected, unlabeled data is able to successfully perform complex manipulation tasks with a wide range of objects not seen during training.

[1]  Ian Taylor,et al.  Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[2]  Abhinav Gupta,et al.  Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[3]  Jitendra Malik,et al.  Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.

[4]  François Chaumette,et al.  Predictive Control for Constrained Image-Based Visual Servoing , 2010, IEEE Transactions on Robotics.

[5]  Haibin Ling,et al.  Robust visual tracking using ℓ1 minimization , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[6]  Ross A. Knepper,et al.  DeepMPC: Learning Deep Latent Features for Model Predictive Control , 2015, Robotics: Science and Systems.

[7]  Danica Kragic,et al.  Survey on Visual Servoing for Manipulation , 2002 .

[8]  Sergey Levine,et al.  Self-Supervised Visual Planning with Temporal Skip Connections , 2017, CoRL.

[9]  Stefan Roth,et al.  UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss , 2017, AAAI.

[10]  Dirk P. Kroese,et al.  The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-carlo Simulation (Information Science and Statistics) , 2004 .

[11]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[12]  K. Madhava Krishna,et al.  Exploring convolutional networks for end-to-end visual servoing , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[13]  Honglak Lee,et al.  Deep learning for detecting robotic grasps , 2013, Int. J. Robotics Res..

[14]  Giulio Sandini,et al.  A Vision-Based Learning Method for Pushing Manipulation , 1993 .

[15]  Ming-Hsuan Yang,et al.  Visual tracking with online Multiple Instance Learning , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[17]  Franziska Meier,et al.  SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Planning and Control , 2017, ArXiv.

[18]  Xinyu Liu,et al.  Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics , 2017, Robotics: Science and Systems.

[19]  Koichiro Deguchi,et al.  Image-based simultaneous control of robot and target object motions by direct-image-interpretation method , 1999, Proceedings 1999 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human and Environment Friendly Robots with High Intelligence and Emotional Quotients (Cat. No.99CH36289).

[20]  David Q. Mayne,et al.  Model predictive control: Recent developments and future promise , 2014, Autom..

[21]  Alberto Rodriguez,et al.  Learning Synergies Between Pushing and Grasping with Self-Supervised Deep Reinforcement Learning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[22]  Alonzo Kelly,et al.  Receding Horizon Model-Predictive Control for Mobile Robot Navigation of Intricate Paths , 2009, FSR.

[23]  Sergey Levine,et al.  Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..

[24]  Éric Marchand,et al.  Photometric visual servoing for omnidirectional cameras , 2013, Autonomous Robots.

[25]  Sergey Levine,et al.  Deep spatial autoencoders for visuomotor learning , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[26]  Sergey Levine,et al.  Learning Visual Servoing with Deep Features and Fitted Q-Iteration , 2017, ICLR.

[27]  Peter I. Corke,et al.  A tutorial on visual servo control , 1996, IEEE Trans. Robotics Autom..

[28]  Martin A. Riedmiller,et al.  Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images , 2015, NIPS.

[29]  S. Shankar Sastry,et al.  Decentralized nonlinear model predictive control of multiple flying robots , 2003, 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475).

[30]  William J. Wilson,et al.  Relative end-effector control using Cartesian position based visual servoing , 1996, IEEE Trans. Robotics Autom..

[31]  Dirk P. Kroese,et al.  The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning , 2004 .

[32]  Sergey Levine,et al.  Stochastic Adversarial Video Prediction , 2018, ArXiv.

[33]  Thomas Brox,et al.  High Accuracy Optical Flow Estimation Based on a Theory for Warping , 2004, ECCV.

[34]  D. Sherer Fetal grasping at 16 weeks' gestation , 1993, Journal of ultrasound in medicine : official journal of the American Institute of Ultrasound in Medicine.

[35]  Marko Bacic,et al.  Model predictive control , 2003 .

[36]  Sergey Levine,et al.  Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[37]  Matei T. Ciocarlie,et al.  Data-driven grasping with partial sensor data , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[38]  Abhinav Gupta,et al.  Learning to fly by crashing , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[39]  Jürgen Leitner,et al.  Visual Servoing from Deep Neural Networks , 2017, RSS 2017.

[40]  Sergey Levine,et al.  Uncertainty-Aware Reinforcement Learning for Collision Avoidance , 2017, ArXiv.

[41]  Nolan Wagener,et al.  Information theoretic MPC for model-based reinforcement learning , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[42]  James M. Rehg,et al.  Learning contact locations for pushing and orienting unknown objects , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).

[43]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[44]  Avinash C. Kak,et al.  Vision for Mobile Robot Navigation: A Survey , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[45]  Patrick Rives,et al.  A new approach to visual servoing in robotics , 1992, IEEE Trans. Robotics Autom..