No, to the Right: Online Language Corrections for Robotic Manipulation via Shared Autonomy

Systems for language-guided human-robot interaction must satisfy two key desiderata for broad adoption: adaptivity and learning efficiency. Unfortunately, existing instruction-following agents cannot adapt, lacking the ability to incorporate online natural language supervision; even if they could, they would still require hundreds of demonstrations to learn even simple policies. In this work, we address these problems by presenting Language-Informed Latent Actions with Corrections (LILAC), a framework for incorporating and adapting to natural language corrections ("to the right" or "no, towards the book") online, during execution. We explore rich manipulation domains within a shared autonomy paradigm. Instead of discrete turn-taking between the human and robot, LILAC splits agency between them: language is an input to a learned model that produces a meaningful, low-dimensional control space that the human can use to guide the robot. Each real-time correction refines the human's control space, enabling precise, extended behaviors, with the added benefit of requiring only a handful of demonstrations to learn. We evaluate our approach via a user study in which participants work with a Franka Emika Panda manipulator to complete complex manipulation tasks. Compared to existing learned baselines covering both open-loop instruction following and single-turn shared autonomy, our corrections-aware approach obtains higher task completion rates and is subjectively preferred by users for its reliability, precision, and ease of use.
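To make the described architecture concrete, the sketch below illustrates one way a language-conditioned latent-action controller of this kind could be structured. It is a minimal illustration, not the authors' implementation: the module name LatentActionDecoder, the fuse_correction blending rule, and all dimensions are assumptions introduced here. It shows the core idea from the abstract: a learned model maps a low-dimensional human control input and the robot state, conditioned on a language embedding, to a high-DoF robot action, and each new correction refines that conditioning during execution.

```python
# Minimal sketch (not the authors' code) of a language-conditioned latent-action
# controller. Names, the correction-fusion rule, and dimensions are illustrative
# assumptions; only the overall structure follows the abstract's description.
import torch
import torch.nn as nn


class LatentActionDecoder(nn.Module):
    """Maps a low-dim human input z, robot state s, and a language embedding
    to a robot action, using FiLM-style conditioning on the language."""

    def __init__(self, z_dim=2, state_dim=7, lang_dim=384, action_dim=7, hidden=128):
        super().__init__()
        self.film = nn.Linear(lang_dim, 2 * hidden)  # per-feature scale and shift
        self.trunk = nn.Sequential(nn.Linear(z_dim + state_dim, hidden), nn.GELU())
        self.head = nn.Linear(hidden, action_dim)

    def forward(self, z, state, lang_emb):
        h = self.trunk(torch.cat([z, state], dim=-1))
        gamma, beta = self.film(lang_emb).chunk(2, dim=-1)
        return self.head(gamma * h + beta)


def fuse_correction(lang_emb, correction_emb, alpha=0.5):
    # Assumption: an online correction refines the control space by blending the
    # current language embedding with the embedding of the new utterance.
    return (1 - alpha) * lang_emb + alpha * correction_emb


# Usage: at each control step, decode a low-dim joystick input into a 7-DoF action.
decoder = LatentActionDecoder()
lang_emb = torch.randn(1, 384)                 # stands in for an utterance embedding
z, state = torch.zeros(1, 2), torch.zeros(1, 7)
action = decoder(z, state, lang_emb)
lang_emb = fuse_correction(lang_emb, torch.randn(1, 384))  # e.g. "no, to the right"
```

In this reading, the human keeps continuous control through the low-dimensional input z, while language (and each subsequent correction) only reshapes what that control space means, which is consistent with the shared autonomy framing above.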
