Helping Robots Learn: A Human-Robot Master-Apprentice Model Using Demonstrations via Virtual Reality Teleoperation

As artificial intelligence becomes an increasingly prevalent method of enhancing robotic capabilities, it is important to consider effective ways to train these learning pipelines and to leverage human expertise. Working towards these goals, a master-apprentice model is presented and evaluated on a grasping task for effectiveness and human perception. The apprenticeship model augments self-supervised learning with learning by demonstration, using the human's time and expertise efficiently while facilitating future scalability to supervision of multiple robots; the human provides demonstrations via virtual reality only when the robot cannot complete the task autonomously. Experimental results indicate that the robot learns the grasping task faster with the apprenticeship model than with a solely self-supervised approach, and with fewer human interventions than a solely demonstration-based approach; 100% grasping success is obtained after 150 grasps with 19 demonstrations. Preliminary user studies evaluating workload, usability, and effectiveness of the system yield promising results for system scalability and deployability. They also suggest a tendency for users to overestimate the robot's skill and to generalize its capabilities, especially as learning improves.
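The abstract describes the hybrid training scheme only at a high level. The sketch below illustrates one way such a master-apprentice loop could be structured: the robot attempts grasps on its own and labels its own successes, and a human demonstration via VR teleoperation is requested only after an autonomous failure. All interfaces here (`policy`, `env`, `request_vr_demonstration`) are illustrative assumptions for exposition, not the authors' implementation.

```python
def train_apprentice(policy, env, request_vr_demonstration, num_grasps=150):
    """Minimal sketch of a master-apprentice grasp-learning loop.

    policy  -- assumed object exposing propose_grasp(obs) and update(obs, grasp, label)
    env     -- assumed robot/scene interface exposing observe() and execute(grasp)
    request_vr_demonstration -- assumed callback that returns a human grasp given obs
    """
    demonstrations = 0
    for _ in range(num_grasps):
        observation = env.observe()                  # e.g. a depth image of the scene
        grasp = policy.propose_grasp(observation)
        success = env.execute(grasp)                 # robot attempts the grasp itself

        if success:
            # Self-supervised label: the robot's own attempt succeeded.
            policy.update(observation, grasp, label=1)
        else:
            # Apprenticeship step: ask the human master for a VR demonstration
            # only when the robot cannot complete the task autonomously.
            demo_grasp = request_vr_demonstration(observation)
            demo_success = env.execute(demo_grasp)
            policy.update(observation, grasp, label=0)
            policy.update(observation, demo_grasp, label=int(demo_success))
            demonstrations += 1

    return policy, demonstrations
```

Gating human involvement on autonomous failure is what concentrates the master's effort where it matters; it is consistent with the abstract's report that only 19 demonstrations were needed over 150 grasps.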
