ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging

Grasping and manipulating objects is an important human skill. Since hand-object contact is fundamental to grasping, capturing it can lead to important insights. However, observing contact through external sensors is challenging because of occlusion and the complexity of the human hand. We present ContactDB, a novel dataset of contact maps for household objects that captures the rich hand-object contact that occurs during grasping, enabled by the use of a thermal camera. Participants in our study grasped 3D printed objects with a post-grasp functional intent. ContactDB includes 3750 3D meshes of 50 household objects textured with contact maps and 375K frames of synchronized RGB-D+thermal images. To the best of our knowledge, this is the first large-scale dataset that records detailed contact maps for human grasps. Analysis of this data shows the influence of functional intent and object size on grasping, the tendency to touch or avoid 'active areas', and the high frequency of palm and proximal finger contact. Finally, we train state-of-the-art image translation and 3D convolution algorithms to predict diverse contact patterns from object shape. Data, code and models are available at https://contactdb.cc.gatech.edu.
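To make the shape-to-contact prediction task concrete, below is a minimal sketch of a VoxNet-style 3D encoder-decoder that maps a binary voxel occupancy grid of an object to a per-voxel contact probability map. This is an illustrative assumption, not the authors' released model: the class name `ContactVoxNet`, the layer sizes, and the 64³ grid resolution are hypothetical, and the sketch omits the multiple-hypothesis training the paper uses to capture diverse contact patterns.

```python
# Minimal sketch (assumed PyTorch; not the authors' released code): a
# VoxNet-style 3D CNN that maps a binary voxel occupancy grid of an
# object to a per-voxel grasp-contact probability map.
import torch
import torch.nn as nn


class ContactVoxNet(nn.Module):
    """Predict per-voxel contact probability from object shape."""

    def __init__(self):
        super().__init__()
        # Encoder: downsample the occupancy grid twice (64 -> 32 -> 16).
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(16, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder: upsample back to the input resolution (16 -> 32 -> 64).
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(32, 16, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(16, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, occupancy):
        # occupancy: (B, 1, 64, 64, 64), values in {0, 1}
        logits = self.decoder(self.encoder(occupancy))
        return torch.sigmoid(logits)  # per-voxel contact probability


model = ContactVoxNet()
voxels = (torch.rand(2, 1, 64, 64, 64) > 0.9).float()  # dummy object grids
contact = model(voxels)
print(contact.shape)  # torch.Size([2, 1, 64, 64, 64])
```

A model of this kind would be trained with a per-voxel binary cross-entropy loss against the thermally observed contact maps; predicting several hypotheses and back-propagating only the best one is one way to model the diversity of human grasps.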
