论文信息 - EdgeNet: Semantic Scene Completion from RGB-D images

EdgeNet: Semantic Scene Completion from RGB-D images

Semantic scene completion is the task of predicting a complete 3D representation of volumetric occupancy with corresponding semantic labels for a scene from a single point of view. Previous works on Semantic Scene Completion from RGB-D data used either only depth or depth with colour by projecting the 2D image into the 3D volume resulting in a sparse data representation. In this work, we present a new strategy to encode colour information in 3D space using edge detection and flipped truncated signed distance. We also present EdgeNet, a new end-to-end neural network architecture capable of handling features generated from the fusion of depth and edge information. Experimental results show improvement of 6.9% over the state-of-the-art result on real data, for end-to-end approaches.

[1] John F. Canny,et al. A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3] R. Weale. Vision. A Computational Investigation Into the Human Representation and Processing of Visual Information. David Marr , 1983 .

[4] Quoc V. Le,et al. Don't Decay the Learning Rate, Increase the Batch Size , 2017, ICLR.

[5] Jonathan T. Barron,et al. A category-level 3-D object dataset: Putting the Kinect to work , 2011, ICCV Workshops.

[6] Duc Thanh Nguyen,et al. A Field Model for Repairing 3D Shapes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Andrew Owens,et al. SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels , 2013, 2013 IEEE International Conference on Computer Vision.

[8] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[9] Roberto Cipolla,et al. SceneNet: Understanding Real World Indoor Scenes With Synthetic Data , 2015, ArXiv.

[10] Hongen Liao,et al. Efficient Semantic Scene Completion Network with Spatial Group Convolution , 2018, ECCV.

[11] Leslie N. Smith,et al. A disciplined approach to neural network hyper-parameters: Part 1 - learning rate, batch size, momentum, and weight decay , 2018, ArXiv.

[12] Adrian Hilton,et al. Semantic Scene Completion Combining Colour and Depth: preliminary experiments , 2018, ArXiv.

[13] Jitendra Malik,et al. Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[14] Derek Hoiem,et al. Predicting Complete 3D Models of Indoor Scenes , 2015, ArXiv.

[15] Jason Weston,et al. Curriculum learning , 2009, ICML '09.

[16] C. McDiarmid. SIMULATED ANNEALING AND BOLTZMANN MACHINES A Stochastic Approach to Combinatorial Optimization and Neural Computing , 1991 .

[17] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18] Simon J. Julier,et al. Structured Prediction of Unobserved Voxels from a Single Depth Image , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Derek Hoiem,et al. Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[20] Dieter Fox,et al. RGB-(D) scene labeling: Features and algorithms , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[21] Thomas A. Funkhouser,et al. Semantic Scene Completion from a Single Depth Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Jianxiong Xiao,et al. SUN RGB-D: A RGB-D scene understanding benchmark suite , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23] Xin Tong,et al. View-Volume Network for Semantic Scene Completion from a Single Depth Image , 2018, IJCAI.

[24] Yu Hu,et al. See and Think: Disentangling Semantic Scene Completion , 2018, NeurIPS.

[25] Juergen Gall,et al. Two Stream 3D Semantic Scene Completion , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[26] Sanja Fidler,et al. 3D Graph Neural Networks for RGBD Semantic Segmentation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).