Can I Pour Into It? Robot Imagining Open Containability Affordance of Previously Unseen Objects via Physical Simulations

Open containers, i.e., containers without covers, are an important and ubiquitous class of objects in human life. In this letter, we propose a novel method for robots to "imagine" the open containability affordance of a previously unseen object via physical simulations. The robot autonomously scans the object with an RGB-D camera. The scanned 3D model is used for open containability imagination, which quantifies the open containability affordance by physically simulating dropping particles onto the object and counting how many particles are retained in it. This quantification is used for open-container vs. non-open-container binary classification (hereafter referred to as open container classification). If the object is classified as an open container, the robot further imagines pouring into it, again using physical simulations, to obtain the pouring position and orientation for autonomous pouring with a real robot. We evaluate our method on open container classification and autonomous pouring of granular material using a dataset of 130 previously unseen objects spanning 57 object categories. Although our method uses only 11 objects for simulation calibration, its open container classification aligns well with human judgments. In addition, our method enables the robot to autonomously pour into the 55 containers in the dataset with a very high success rate. We also compare our method with a deep learning baseline. Results show that our method matches the deep learning baseline on open container classification and outperforms it on autonomous pouring. Moreover, our method is fully explainable.
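To make the containability quantification concrete, below is a minimal sketch of the particle-dropping test using PyBullet. The function name, particle count, spawn pattern, and the bounding-box retention check are illustrative assumptions, not the paper's calibrated setup; in particular, the letter calibrates its simulation on 11 objects, whereas this sketch uses an uncalibrated threshold for demonstration only.

```python
import pybullet as p
import numpy as np

def containability_score(mesh_path, n_particles=100, radius=0.005, steps=500):
    """Drop small spheres over a scanned mesh; return the fraction retained."""
    p.connect(p.DIRECT)                        # headless physics server
    p.setGravity(0, 0, -9.81)

    # Load the scanned object as a static concave triangle mesh.
    # GEOM_FORCE_CONCAVE_TRIMESH is only valid for static (mass = 0) bodies.
    col = p.createCollisionShape(p.GEOM_MESH, fileName=mesh_path,
                                 flags=p.GEOM_FORCE_CONCAVE_TRIMESH)
    obj = p.createMultiBody(baseMass=0, baseCollisionShapeIndex=col)
    (min_x, min_y, min_z), (max_x, max_y, max_z) = p.getAABB(obj)

    # Spawn small spheres at random positions just above the bounding box.
    sphere = p.createCollisionShape(p.GEOM_SPHERE, radius=radius)
    rng = np.random.default_rng(0)
    particles = []
    for _ in range(n_particles):
        pos = [rng.uniform(min_x, max_x),
               rng.uniform(min_y, max_y),
               max_z + 0.05]                   # start 5 cm above the object
        particles.append(p.createMultiBody(baseMass=1e-3,
                                           baseCollisionShapeIndex=sphere,
                                           basePosition=pos))

    for _ in range(steps):                     # let the particles settle
        p.stepSimulation()

    # A particle counts as "retained" if it settles inside the object's
    # axis-aligned bounding box (a crude proxy for the paper's test).
    retained = 0
    for body in particles:
        (x, y, z), _ = p.getBasePositionAndOrientation(body)
        if min_x <= x <= max_x and min_y <= y <= max_y and min_z <= z <= max_z:
            retained += 1
    p.disconnect()
    return retained / n_particles

# Classify as an open container if enough particles are retained; the
# threshold below is a placeholder and would be calibrated on known objects.
# is_open_container = containability_score("scan.obj") > 0.1
```

The same simulate-and-count pattern extends to the pouring imagination step: one would sweep candidate pouring positions and orientations above the object, re-run the particle drop for each, and keep the pose that maximizes retained particles.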
