Using dVRK teleoperation to facilitate deep learning of automation tasks for an industrial robot

Deep Learning from Demonstrations (Deep LfD) is a promising approach for robots to perform precise bilateral automation tasks involving contact and deformation, where dynamics are difficult to model explicitly. Deep LfD methods typically use datasets of (1) human videos, which do not match robot kinematics and capabilities, or (2) waypoints collected with tedious move-and-record interfaces, such as teaching pendants or kinesthetic teaching. We explore an alternative using the Intuitive Surgical da Vinci, which combines a pair of gravity-balanced, high-precision, passive 6-DOF master arms with stereo vision, allowing humans to teleoperate precise surgical automation tasks. We present DY-Teleop, an interface between the da Vinci master manipulators and an ABB YuMi industrial robot that facilitates the collection of time-synchronized images and robot states for deep learning of automation tasks involving deformation and dynamic contact. The system has an average latency of 194 ms and executes commands at 6 Hz. We also present YuMiPy, an open-source library and ROS package for controlling an ABB YuMi over Ethernet. Data collection experiments with scooping a ball into a cup, untying a knot in a rope, and pipetting liquid between two containers suggest that demonstrations collected with DY-Teleop take a comparable amount of time to those collected with kinesthetic teaching. We performed Deep LfD for the scooping task and found that the policy trained with DY-Teleop achieved a 1.8× higher success rate than a policy trained with kinesthetic teaching. Code, videos, and data are available at berkeleyautomation.github.io/teleop.
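
The abstract does not say how images and robot states are time-synchronized. As a minimal sketch of one common approach, assuming each stream carries its own timestamps in seconds, every image can be paired with the nearest robot state and dropped when the gap is too large. The function name `pair_nearest` and the `max_gap` threshold are hypothetical and are not part of DY-Teleop or YuMiPy. At the reported 6 Hz command rate, consecutive states are about 1/6 ≈ 0.167 s apart.

```python
from bisect import bisect_left


def pair_nearest(image_times, state_times, max_gap):
    """Pair each image timestamp with the index of the nearest state timestamp.

    Hypothetical helper for illustration only; not taken from the paper's code.
    state_times must be sorted in ascending order. An image whose nearest
    state is more than max_gap seconds away is paired with None.
    """
    pairs = []
    for t in image_times:
        # Index of the first state timestamp that is >= t.
        i = bisect_left(state_times, t)
        # The nearest state is either just before or at that index.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(state_times)]
        if not candidates:
            pairs.append(None)
            continue
        best = min(candidates, key=lambda j: abs(state_times[j] - t))
        pairs.append(best if abs(state_times[best] - t) <= max_gap else None)
    return pairs


# Assumed example data: states logged roughly every 1/6 s.
state_times = [0.0, 0.167, 0.333, 0.5]
image_times = [0.01, 0.2, 0.9]
matches = pair_nearest(image_times, state_times, max_gap=0.1)
```

Here the first two images match states 0 and 1, while the third image has no state within 0.1 s and is left unpaired.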
