Paper information - Coarse-to-fine segmentation for indoor scenes with progressive supervision

Abstract: Three-dimensional indoor scene segmentation is difficult because scenes have natural hierarchical structures and complicated contextual relationships. This paper proposes a 3D scene segmentation method that uses a stacked network to exploit the context and hierarchy in 3D scenes. The method has two parts: a stacked network and progressive supervision. The stacked network consists of multiple base segmentation networks; each network's output is concatenated with the raw input and fed to the next network, which provides prior context. Progressive supervision uses a group of coarse-to-fine segmentation labels, generated from the spatial relationships among objects in the scene, to force the network to learn the hierarchy. Experimental results on a regular dataset and a complex dataset show that progressive supervision is effective and that the method outperforms existing methods in complex scenes.
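The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base network is replaced by a hypothetical per-point linear scorer (`make_base_net`), the names `build_stack`, `stacked_forward`, and `progressive_labels` are invented for illustration, and the coarse label groups are given as a hand-written mapping, whereas the paper derives them from spatial relationships among objects. The sketch only shows how each stage receives the raw input concatenated with the previous stage's output, and how one label set per stage runs from coarse to fine.

```python
import random


def make_base_net(in_dim, num_classes, rng):
    """Hypothetical stand-in for a base segmentation network:
    a per-point linear scorer with random weights."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(num_classes)]

    def net(points):
        # points: list of per-point feature vectors.
        # Returns one score vector (length num_classes) per point.
        return [[sum(w * x for w, x in zip(row, p)) for row in weights]
                for p in points]

    return net


def build_stack(raw_dim, classes_per_stage, seed=0):
    """Build one base network per stage, ordered coarse to fine.
    Every stage after the first also takes the previous stage's scores."""
    rng = random.Random(seed)
    nets = []
    prev_out = 0
    for num_classes in classes_per_stage:
        nets.append(make_base_net(raw_dim + prev_out, num_classes, rng))
        prev_out = num_classes
    return nets


def stacked_forward(nets, points):
    """Run the stack and return the per-point scores of every stage."""
    outputs = []
    prev = None
    for net in nets:
        if prev is None:
            inputs = points
        else:
            # Concatenate the raw input with the previous stage's output.
            inputs = [p + s for p, s in zip(points, prev)]
        prev = net(inputs)
        outputs.append(prev)
    return outputs


def progressive_labels(fine_labels, level_maps):
    """Return one label list per stage, coarsest first and finest last.
    level_maps holds assumed fine-to-coarse mappings, one per coarse level."""
    coarse_levels = [[mapping[y] for y in fine_labels] for mapping in level_maps]
    return coarse_levels + [list(fine_labels)]


# Two points with 3 raw features; stage 1 predicts 2 coarse classes,
# stage 2 predicts 4 fine classes.
points = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.5]]
nets = build_stack(raw_dim=3, classes_per_stage=[2, 4])
stage_scores = stacked_forward(nets, points)

# Assumed grouping: fine classes 0 and 1 form coarse class 0, 2 and 3 form 1.
labels = progressive_labels([0, 3, 2], [{0: 0, 1: 0, 2: 1, 3: 1}])
```

In training, each stage's scores would be compared against the label list at the same index, so the early stages are supervised with coarse labels and the last stage with the fine labels.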
