3D Shape Perception from Monocular Vision, Touch, and Shape Priors

Perceiving accurate 3D object shape is important for robots to interact with the physical world. Research in this direction has relied primarily on visual observations. Vision, however useful, has inherent limitations due to occlusions and 2D-to-3D ambiguities, especially for perception with a monocular camera. Touch, in contrast, provides precise local shape information, but reconstructing an entire shape through touch alone is inefficient. In this paper, we propose a novel paradigm that efficiently perceives accurate 3D object shape by combining visual and tactile observations with prior knowledge of common object shapes learned from large-scale shape repositories. We use vision first, applying neural networks with learned shape priors to predict an object's 3D shape from a single-view color image. We then use tactile sensing to refine the shape: the robot actively touches the object regions where the visual prediction is most uncertain. Our method efficiently reconstructs the 3D shape of common objects from a color image and a small number of tactile explorations (around 10). Our setup is easy to deploy and has the potential to help robots better perform grasping and manipulation tasks on real-world objects.
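The vision-then-touch loop above can be sketched in a few lines. This is a hypothetical toy illustration, not the paper's implementation: `predict_shape` stands in for the learned single-view network and simply returns per-voxel occupancy probabilities, whose binary entropy serves as the uncertainty map guiding where to touch; each simulated touch clamps the most uncertain voxel to its ground-truth value.

```python
import numpy as np

def predict_shape(rng, grid=8):
    """Stand-in for the vision network: per-voxel occupancy probabilities."""
    return rng.uniform(0.2, 0.8, size=(grid, grid, grid))

def voxel_entropy(p, eps=1e-9):
    """Binary entropy of each voxel's occupancy probability."""
    return -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))

def refine_with_touch(prob, ground_truth, n_touches=10):
    """Touch the most uncertain voxel, clamp it to the measured value, repeat."""
    prob = prob.copy()
    for _ in range(n_touches):
        # Pick the voxel where the visual prediction is least certain.
        idx = np.unravel_index(np.argmax(voxel_entropy(prob)), prob.shape)
        # A touch yields (near-)exact local shape; its entropy drops to ~0,
        # so the next iteration moves on to a different region.
        prob[idx] = ground_truth[idx]
    return prob

rng = np.random.default_rng(0)
truth = (rng.uniform(size=(8, 8, 8)) > 0.5).astype(float)
initial = predict_shape(rng)
refined = refine_with_touch(initial, truth, n_touches=10)
print(int(np.sum(refined != initial)))  # → 10: ten voxels corrected by touch
```

The key design choice mirrored here is that touch is spent only where vision is uncertain, which is why roughly ten explorations suffice in the paper's setting rather than an exhaustive tactile scan.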
