Paper information - SliceNet: deep dense depth estimation from a single indoor panorama using a slice-based representation

We introduce a novel deep neural network to estimate a depth map from a single monocular indoor panorama. The network directly works on the equirectangular projection, exploiting the properties of indoor 360° images. Starting from the fact that gravity plays an important role in the design and construction of man-made indoor scenes, we propose a compact representation of the scene into vertical slices of the sphere, and we exploit long- and short-term relationships among slices to recover the equirectangular depth map. Our design makes it possible to maintain high-resolution information in the extracted features even with a deep network. The experimental results demonstrate that our method outperforms current state-of-the-art solutions in prediction accuracy, particularly for real-world data.
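As a rough illustration of the slice-based representation described above, the following sketch cuts an equirectangular image into vertical bands of equal longitude extent and reduces each band to a one-dimensional descriptor, giving an ordered sequence of slices. It is not the authors' implementation: the abstract indicates that the network operates on learned features and models relationships among slices, whereas this example works on raw pixel values with a plain average. The function names (`column_longitudes`, `to_vertical_slices`, `slice_descriptors`) and the longitude convention are assumptions made for illustration.

```python
import math


def column_longitudes(width):
    """Longitude (radians) at the center of each column of an equirectangular image.

    Assumed convention: columns cover [-pi, pi) from left to right.
    """
    return [(x + 0.5) / width * 2.0 * math.pi - math.pi for x in range(width)]


def to_vertical_slices(image, num_slices):
    """Split an image (list of rows) into num_slices vertical bands.

    Each band keeps the full image height, so it covers every latitude
    within a fixed range of longitudes.
    """
    width = len(image[0])
    if width % num_slices != 0:
        raise ValueError("image width must be divisible by num_slices")
    slice_w = width // num_slices
    return [
        [row[s * slice_w:(s + 1) * slice_w] for row in image]
        for s in range(num_slices)
    ]


def slice_descriptors(image, num_slices):
    """Average each band across its width, giving one value per row.

    The result is a sequence of num_slices vectors of length H, ordered by
    longitude. A plain mean stands in here for learned feature extraction.
    """
    return [
        [sum(band_row) / len(band_row) for band_row in band]
        for band in to_vertical_slices(image, num_slices)
    ]


# Toy 2 x 8 single-channel "panorama".
panorama = [[r * 8 + c for c in range(8)] for r in range(2)]
slices = to_vertical_slices(panorama, 4)
sequence = slice_descriptors(panorama, 4)
```

Because the panorama wraps around horizontally, the first and last slices in such a sequence are neighbours on the sphere, which is one reason a sequence model over slices is a natural fit for this layout.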
