PATO: Policy Assisted TeleOperation for Scalable Robot Data Collection

Large-scale data is an essential component of machine learning, as demonstrated by recent advances in natural language processing and computer vision research. However, collecting large-scale robotic data is far slower and more expensive, because each operator can control only a single robot at a time. To make this costly data collection process efficient and scalable, we propose Policy Assisted TeleOperation (PATO), a system that automates part of the demonstration collection process using a learned assistive policy. PATO autonomously executes repetitive behaviors during data collection and asks for human input only when it is uncertain about which subtask or behavior to execute. We conduct teleoperation user studies with both a real robot and a simulated robot fleet, and demonstrate that our assisted teleoperation system reduces human operators' mental load while improving data collection efficiency. Further, it enables a single operator to control multiple robots in parallel, which is a first step towards scalable robotic data collection. For code and video results, see https://clvrai.com/pato
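
The idea of asking for human input only under uncertainty can be illustrated with a minimal sketch. It assumes, purely for illustration, that uncertainty is measured as disagreement among an ensemble of policy action predictions; the abstract does not specify the uncertainty measure, and the function names, the threshold value, and the `human_action_fn` callback are all hypothetical.

```python
import numpy as np


def should_ask_human(ensemble_actions, threshold=0.1):
    """Return True when ensemble members disagree by more than `threshold`.

    `ensemble_actions` has shape (num_members, action_dim). Disagreement is
    the per-dimension standard deviation across members, averaged over
    dimensions. Both the measure and the threshold are illustrative
    assumptions, not taken from the paper.
    """
    actions = np.asarray(ensemble_actions, dtype=float)
    disagreement = actions.std(axis=0).mean()
    return bool(disagreement > threshold)


def assisted_step(ensemble_actions, human_action_fn, threshold=0.1):
    """Pick the action for one control step and report who supplied it.

    `human_action_fn` is a hypothetical callback that queries the operator;
    it is called only when the ensemble is uncertain.
    """
    if should_ask_human(ensemble_actions, threshold):
        return np.asarray(human_action_fn(), dtype=float), "human"
    # Confident case: execute the mean of the ensemble predictions.
    return np.asarray(ensemble_actions, dtype=float).mean(axis=0), "policy"


if __name__ == "__main__":
    confident = [[0.5, 0.0], [0.5, 0.0], [0.5, 0.0]]
    uncertain = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
    print(assisted_step(confident, lambda: [0.0, 0.0]))
    print(assisted_step(uncertain, lambda: [0.0, 0.0]))
```

Under this kind of gating, an operator is interrupted only at uncertain steps, which is what would free their attention to supervise several robots at once.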
