Shape-aware spatio-temporal descriptors for interaction classification

Many real-world tasks for autonomous agents benefit from understanding dynamic inter-object interactions. Detecting, analyzing, and differentiating the various ways in which an object can be interacted with provides implicit information about its function, which can help train autonomous agents to handle objects and understand unknown scenes. We describe a general mathematical framework to analyze and classify interactions, defined as dynamic motions performed by an active object onto a passive one. We factorize interactions via motion features that are computed in the spatio-temporal domain and encoded into a global, object-centric signature. Equipped with a similarity measure to compare such signatures, we showcase classification of interactions with a single object. We also propose a novel acquisition setup that combines RGBD sensing with a virtual reality (VR) display to capture interactions with purely virtual objects.
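The pipeline summarized above (motion features, a global signature, a similarity measure, then classification) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the feature (a histogram of per-frame displacement magnitudes of the active object, expressed in the passive object's frame), the L1 distance standing in for the similarity measure, the nearest-neighbor rule, and all function and parameter names (`motion_signature`, `signature_distance`, `classify`, `bins`, `max_speed`).

```python
import math

def motion_signature(displacements, bins=8, max_speed=1.0):
    """Toy signature: normalized histogram of displacement magnitudes.

    `displacements` is a sequence of (dx, dy, dz) per-frame motions of the
    active object, assumed to be expressed in the passive object's frame.
    """
    hist = [0.0] * bins
    for dx, dy, dz in displacements:
        speed = math.sqrt(dx * dx + dy * dy + dz * dz)
        if speed <= max_speed:  # magnitudes above max_speed are ignored
            idx = min(int(speed / max_speed * bins), bins - 1)
            hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def signature_distance(a, b):
    """L1 distance between two signatures (a stand-in similarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def classify(query, labeled_signatures):
    """Return the label of the nearest signature in (label, signature) pairs."""
    label, _ = min(labeled_signatures,
                   key=lambda item: signature_distance(query, item[1]))
    return label

# Hypothetical training data: slow motions labeled "push", fast ones "hit".
training = [
    ("push", motion_signature([(0.1, 0.0, 0.0)] * 10)),
    ("hit", motion_signature([(0.9, 0.0, 0.0)] * 10)),
]
query = motion_signature([(0.12, 0.0, 0.0)] * 5)
predicted = classify(query, training)
```

In this sketch the query falls in the same histogram bin as the slow example, so the nearest-neighbor rule assigns it the "push" label. A richer feature set or a transport-based distance could be substituted without changing the overall structure.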
