Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction

3D reconstruction from a single RGB image is a challenging problem in computer vision. Previous methods are usually purely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability. In this work, we focus on object-level 3D reconstruction and present a geometry-based end-to-end deep learning framework that first detects the mirror plane of the reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondences induced by that symmetry. Our method fully utilizes the geometric cues from symmetry at test time by building plane-sweep cost volumes, a powerful tool that has been used in multi-view stereopsis. To our knowledge, this is the first work that uses the concept of cost volumes in the setting of single-image 3D reconstruction. We conduct extensive experiments on the ShapeNet dataset and find that our reconstruction method significantly outperforms the previous state-of-the-art single-view 3D reconstruction networks in terms of the accuracy of camera poses and depth maps, without requiring objects to be completely symmetric. Code is available at this https URL.
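
The geometric idea behind the intra-image correspondence can be illustrated with a minimal sketch. It is not the authors' network: it assumes a pinhole camera with a single focal length and a known mirror plane written as `normal . x + offset = 0` with a unit-length normal, and all function and parameter names are hypothetical. For a pixel and a candidate depth, it back-projects the pixel to 3D, reflects the point across the mirror plane, and projects the reflection back into the same image. A plane-sweep cost volume would then compare image features at the two pixels for every candidate depth.

```python
def reflect_point(point, normal, offset):
    """Reflect a 3D point across the plane normal . x + offset = 0.

    Assumes `normal` has unit length.
    """
    signed_dist = sum(p * n for p, n in zip(point, normal)) + offset
    return tuple(p - 2.0 * signed_dist * n for p, n in zip(point, normal))


def symmetric_pixel(u, v, depth, normal, offset, focal, cx, cy):
    """Return the pixel where the mirror image of (u, v) at `depth` projects.

    Returns None when the reflected point lies behind the camera.
    """
    # Back-project the pixel to a 3D point in camera coordinates.
    point = ((u - cx) / focal * depth, (v - cy) / focal * depth, depth)
    rx, ry, rz = reflect_point(point, normal, offset)
    if rz <= 0.0:
        return None
    # Project the reflected point with the same pinhole model.
    return (focal * rx / rz + cx, focal * ry / rz + cy)


def sweep_depths(u, v, depths, normal, offset, focal, cx, cy):
    """List (depth, symmetric pixel) pairs for each candidate depth."""
    return [
        (d, symmetric_pixel(u, v, d, normal, offset, focal, cx, cy))
        for d in depths
    ]


if __name__ == "__main__":
    # Hypothetical tilted mirror plane: 0.8 x + 0.6 z - 1.2 = 0.
    plane_normal, plane_offset = (0.8, 0.0, 0.6), -1.2
    for d, pix in sweep_depths(64.0, 64.0, [1.5, 2.0, 3.0],
                               plane_normal, plane_offset,
                               focal=100.0, cx=64.0, cy=64.0):
        print(d, pix)
```

Because the corresponding pixel generally moves as the depth hypothesis changes, the depth at which the two pixels look most alike is a plausible estimate, which is the same principle plane sweeping exploits in multi-view stereo.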
