论文信息 - Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction

Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction

3D reconstruction from a single RGB image is a challenging problem in computer vision. Previous methods are usually solely data-driven, which lead to inaccurate 3D shape recovery and limited generalization capability. In this work, we focus on object-level 3D reconstruction and present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondence of the symmetry. Our method fully utilizes the geometric cues from symmetry during the test time by building plane-sweep cost volumes, a powerful tool that has been used in multi-view stereopsis. To our knowledge, this is the first work that uses the concept of cost volumes in the setting of single-image 3D reconstruction. We conduct extensive experiments on the ShapeNet dataset and find that our reconstruction method significantly outperforms the previous state-of-the-art single-view 3D reconstruction networks in term of the accuracy of camera poses and depth maps, without requiring objects being completely symmetric. Code is available at this https URL.

[1] Edmond Boyer,et al. Shape Reconstruction Using Volume Sweeping and Learned Photoconsistency , 2018, ECCV.

[2] Jan-Michael Frahm,et al. Pixelwise View Selection for Unstructured Multi-View Stereo , 2016, ECCV.

[3] Jing Xu,et al. Point-Based Multi-View Stereo Network , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[4] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[5] Carlos Hernandez,et al. Multi-View Stereo: A Tutorial , 2015, Found. Trends Comput. Graph. Vis..

[6] Leonidas J. Guibas,et al. ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[7] Hans Knutsson,et al. Detecting rotational symmetries using normalized convolution , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[8] Andreas Geiger,et al. Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[9] Wei Liu,et al. Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images , 2018, ECCV.

[10] Silvio Savarese,et al. 3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction , 2016, ECCV.

[11] Joseph L. Mundy,et al. Repeated Structures: Image Correspondence Constraints and 3D Structure Recovery , 1993, Applications of Invariance in Computer Vision.

[12] Rob Fergus,et al. Depth Map Prediction from a Single Image using a Multi-Scale Deep Network , 2014, NIPS.

[13] Tao Guan,et al. P-MVSNet: Learning Patch-Wise Matching Confidence Aggregation for Multi-View Stereo , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[14] Jean Ponce,et al. Accurate, Dense, and Robust Multiview Stereopsis , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Álvaro González. Measurement of Areas on a Sphere Using Fibonacci and Latitude–Longitude Lattices , 2009, 0912.4540.

[16] Luc Van Gool,et al. Learned Multi-patch Similarity , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17] Richard A. Newcombe,et al. DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18] S. Shankar Sastry,et al. An Invitation to 3-D Vision: From Images to Geometric Models , 2003 .

[19] Richard Szeliski,et al. Handling occlusions in dense multi-view stereo , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[20] Narendra Ahuja,et al. DeepMVS: Learning Multi-view Stereopsis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21] Kihong Park,et al. Learning Descriptor, Confidence, and Depth Estimation in Multi-view Stereo , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[22] Luc Van Gool,et al. Computational Symmetry in Computer Vision and Computer Graphics , 2010, Found. Trends Comput. Graph. Vis..

[23] Hao Su,et al. A Point Set Generation Network for 3D Object Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Thomas Brox,et al. What Do Single-View 3D Reconstruction Networks Learn? , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Yong-Sheng Chen,et al. Pyramid Stereo Matching Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26] Andrea Vedaldi,et al. Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Jiansheng Chen,et al. MVSCRF: Learning Multi-View Stereo With Conditional Random Fields , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[28] Bernhard P. Wrobel,et al. Multiple View Geometry in Computer Vision , 2001 .

[29] Lu Fang,et al. SurfaceNet: An End-to-End 3D Neural Network for Multiview Stereopsis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30] Pascal Fua,et al. Efficient large-scale multi-view stereo for ultra high-resolution image sets , 2011, Machine Vision and Applications.

[31] Dacheng Tao,et al. Deep Ordinal Regression Network for Monocular Depth Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32] Nahum Kiryati,et al. Detecting Symmetry in Grey Level Images: The Global Optimization Approach , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[33] Long Quan,et al. MVSNet: Depth Inference for Unstructured Multi-view Stereo , 2018, ECCV.

[34] Christopher Rasmussen,et al. Analysis of Building Textures for Reconstructing Partially Occluded Facades , 2008, ECCV.

[35] Yichao Zhou,et al. NeurVPS: Neural Vanishing Point Scanning via Conic Convolution , 2019, NeurIPS.

[36] Jitendra Malik,et al. Learning a Multi-View Stereo Machine , 2017, NIPS.

[37] Jiajun Wu,et al. Learning Shape Priors for Single-View 3D Completion and Reconstruction , 2018, ECCV.

[38] T. Poggio,et al. The importance of symmetry and virtual views in three-dimensional object recognition , 1994, Current Biology.

[39] Jia Deng,et al. Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[40] Sebastian Nowozin,et al. Occupancy Networks: Learning 3D Reconstruction in Function Space , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41] Hagit Hel-Or,et al. Symmetry as a Continuous Feature , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[42] Duygu Ceylan,et al. DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction , 2019, NeurIPS.

[43] Yi Zhou,et al. On the Continuity of Rotation Representations in Neural Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44] Giovanni Marola,et al. On the Detection of the Axes of Symmetry of Symmetric and Almost Symmetric Planar Images , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[45] Nikolaus F. Troje,et al. How is bilateral symmetry of human faces used for recognition of novel views? , 1998, Vision Research.

[46] Yizhou Yu,et al. Reconstruction of 3-D Symmetric Curves from Perspective Images without Discrete Features , 2004, ECCV.

[47] Alex Kendall,et al. End-to-End Learning of Geometry and Context for Deep Stereo Regression , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[48] Yann LeCun,et al. Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches , 2015, J. Mach. Learn. Res..

[49] Jan-Olof Eklundh,et al. Detecting Symmetry and Symmetric Constellations of Features , 2006, ECCV.

[50] Yun-Ta Tsai,et al. Portrait shadow manipulation , 2020, ACM Trans. Graph..

[51] Alexey Dosovitskiy,et al. Unsupervised Learning of Shape and Pose with Differentiable Point Clouds , 2018, NeurIPS.

[52] Konrad Schindler,et al. Just Look at the Image: Viewpoint-Specific Surface Normal Prediction for Improved Multi-View Reconstruction , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53] Mathieu Aubry,et al. AtlasNet: A Papier-M\^ach\'e Approach to Learning 3D Surface Generation , 2018, CVPR 2018.

[54] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55] Honglak Lee,et al. Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision , 2016, NIPS.

[56] Thomas Brox,et al. Sparsity Invariant CNNs , 2017, 2017 International Conference on 3D Vision (3DV).

[57] Richard Szeliski,et al. Towards Internet-scale multi-view stereo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.