Paper information - EdgeNet: semantic scene completion from a single RGB-D image

Semantic scene completion is the task of predicting a complete 3D representation of a scene, as volumetric occupancy with corresponding semantic labels, from a single point of view. Previous work on semantic scene completion from RGB-D data used either depth alone or depth together with colour, projecting the 2D image into the 3D volume, which results in a sparse data representation. In this work, we present a new strategy to encode colour information in 3D space using edge detection and flipped truncated signed distance. We also present EdgeNet, a new end-to-end neural network architecture capable of handling features generated from the fusion of depth and edge information. Experimental results show an improvement of 6.9% over the state-of-the-art result on real data for end-to-end approaches.
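The abstract names a flipped truncated signed distance encoding but does not give its formula. The sketch below is only an illustration under an assumed convention, sign(d) * (d_max - min(|d|, d_max)) / d_max, which makes the value largest in magnitude at the surface and zero beyond the truncation distance. The function names, the 1D edge row, and the choice of d_max are hypothetical and are not taken from the paper.

```python
import math


def flipped_tsdf(signed_distance, d_max):
    """Flip a truncated signed distance so its magnitude peaks at the surface.

    Assumed convention (not stated in the abstract):
        sign(d) * (d_max - min(|d|, d_max)) / d_max
    The result lies in [-1, 1]; it is +/-1 on the surface and 0 at or
    beyond the truncation distance d_max.
    """
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    magnitude = min(abs(signed_distance), d_max)
    return math.copysign((d_max - magnitude) / d_max, signed_distance)


def distance_to_nearest_edge(edge_row):
    """Brute-force distance (in voxels) from each cell to the closest edge cell."""
    edge_positions = [i for i, is_edge in enumerate(edge_row) if is_edge]
    if not edge_positions:
        return [math.inf] * len(edge_row)
    return [min(abs(i - e) for e in edge_positions) for i in range(len(edge_row))]


# Hypothetical 1D slice of an edge volume: a single edge voxel at index 2.
edge_row = [0, 0, 1, 0, 0, 0, 0]
distances = distance_to_nearest_edge(edge_row)

# Unsigned distances are used here, so every encoded value is non-negative.
encoded = [flipped_tsdf(float(d), d_max=3.0) for d in distances]
```

With this convention the strongest response sits on the edge voxel itself and decays linearly to zero at the truncation distance, so the encoded volume is denser than a raw projection of edge pixels.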

Adrian Hilton | Hansung Kim | Aloisio Dourado | Teofilo Emidio de Campos
