Language-Guided Adaptive Perception for Efficient Grounded Communication with Robotic Manipulators in Cluttered Environments

The utility of collaborative manipulators for shared tasks depends heavily on the speed and accuracy of communication between the human and the robot. The run-time of recently developed probabilistic inference models for situated symbol grounding of natural language instructions depends on the complexity of the representation of the environment in which they reason. As we move towards more complex bi-directional interactions, tasks, and environments, we need intelligent perception models that can selectively infer the precise pose, semantics, and affordances of objects in cases where inferring exhaustively detailed world models is inefficient and prohibits real-time interaction with these robots. In this paper, we propose a model of language and perception that adapts the configuration of the robot's perception pipeline to tasks for which constructing exhaustively detailed models of the environment is inefficient and inconsequential for symbol grounding. We present experimental results on a synthetic corpus of natural language instructions for robot manipulation in example environments. The results demonstrate that adapting perception significantly reduces the run-time of both perception and situated symbol grounding of the language instructions, without a loss in the accuracy of the latter.
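To make the idea of language-guided adaptation concrete, the following is a minimal sketch, not the model proposed in the paper. It assumes a hand-written keyword lookup where the paper describes a model of language and perception; the detector names (`sphere_detector`, `box_detector`, `color_detector`, `pose_estimator`), the keyword table, and the function `select_detectors` are all hypothetical. It shows only the general pattern: use the instruction to pick a subset of perception components, and fall back to the exhaustive configuration when the instruction gives no guidance.

```python
from __future__ import annotations

# Hypothetical mapping from instruction words to perception components.
# In the paper this role is played by a model of language and perception;
# the fixed table here is only an illustrative stand-in.
KEYWORD_TO_DETECTORS: dict[str, set[str]] = {
    "ball": {"sphere_detector"},
    "box": {"box_detector"},
    "cube": {"box_detector"},
    "red": {"color_detector"},
    "blue": {"color_detector"},
    "left": {"pose_estimator"},
    "nearest": {"pose_estimator"},
}

# The exhaustive configuration: every component named in the table.
ALL_DETECTORS = frozenset().union(*KEYWORD_TO_DETECTORS.values())


def select_detectors(instruction: str) -> set[str]:
    """Return the subset of detectors suggested by the instruction.

    Falls back to the exhaustive set when no keyword matches, so the
    sketch never runs perception with an empty configuration.
    """
    words = instruction.lower().replace(",", " ").replace(".", " ").split()
    selected: set[str] = set()
    for word in words:
        selected |= KEYWORD_TO_DETECTORS.get(word, set())
    return selected or set(ALL_DETECTORS)


if __name__ == "__main__":
    print(sorted(select_detectors("Pick up the red ball.")))
    print(sorted(select_detectors("Go home.")))
```

With this table, an instruction such as "Pick up the red ball." selects only the colour and sphere components, while an instruction with no known keyword selects every component. The run-time saving in such a scheme comes from the components that are skipped.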
