Geometric Affordance Perception: Leveraging Deep 3D Saliency With the Interaction Tensor

Agents that need to act on their surroundings can significantly benefit from perceiving their interaction possibilities, or affordances. In this paper we combine the benefits of the Interaction Tensor, a straightforward geometric representation that captures multiple object-scene interactions, with deep-learning-based saliency for fast parsing of affordances in the environment. Our approach works with visually perceived 3D pointclouds and enables querying a 3D scene for locations that support affordances such as sitting or riding, as well as interactions involving everyday objects, like where to hang an umbrella or place a mug. Crucially, the nature of the interaction description exhibits one-shot generalization. Experiments with numerous synthetic and real RGB-D scenes, validated by human subjects, show that the representation enables the prediction of affordance candidate locations in novel environments from a single training example. The approach also allows for a highly parallelizable, multiple-affordance representation and runs at fast rates. Combining a deep neural network that learns to estimate scene saliency with the one-shot geometric representation aligns well with the expectation that computational models for affordance estimation should be perceptually direct and economical.
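
The abstract describes a two-stage pipeline: a deep saliency network first prunes the scene pointcloud to promising locations, and the one-shot Interaction Tensor descriptor then scores each surviving candidate. Below is a minimal Python sketch of that flow under stated assumptions: the `saliency_net` callable and the `descriptor.match` interface are hypothetical placeholders standing in for the paper's saliency network and Interaction Tensor matching, not its actual API.

```python
# Minimal sketch of the two-stage affordance pipeline described above.
# Assumptions (hypothetical, not the paper's code): `saliency_net` maps
# points to per-point saliency scores, and `descriptor.match` scores how
# well local scene geometry fits the one-shot interaction example.

import numpy as np

def predict_affordance_locations(scene_points, descriptor, saliency_net,
                                 top_k=512, threshold=0.7):
    """Rank candidate locations in a scene pointcloud for one affordance.

    scene_points : (N, 3) array, a visually perceived 3D pointcloud
    descriptor   : Interaction Tensor built from a single training
                   example (e.g. a human "sitting" on a chair)
    saliency_net : callable mapping (N, 3) points to (N,) scores in [0, 1]
    """
    # Stage 1: deep saliency prunes the scene to the most promising
    # points, so the geometric test only runs where an interaction
    # is plausible.
    saliency = saliency_net(scene_points)
    candidates = np.argsort(saliency)[-top_k:]

    # Stage 2: the one-shot geometric descriptor scores each candidate
    # by how well the local scene geometry matches the single training
    # interaction.
    results = []
    for idx in candidates:
        score = descriptor.match(scene_points, scene_points[idx])
        if score >= threshold:
            results.append((scene_points[idx], score))

    # Highest-scoring locations are the predicted affordance candidates.
    return sorted(results, key=lambda r: -r[1])
```

Because each candidate location, and each affordance descriptor, is scored independently in the second stage, the per-candidate tests parallelize naturally, which is consistent with the highly parallelizable, multiple-affordance, fast-rate properties claimed above.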
