Scene Segmentation Driven by Deep Learning and Surface Fitting

This paper proposes a joint color and depth segmentation scheme that combines geometric cues with a learning stage. The approach starts from an initial over-segmentation based on spectral clustering. The input data is also fed to a Convolutional Neural Network (CNN), which produces a descriptor vector for each pixel of the scene. An iterative merging procedure then recombines the segments into regions corresponding to the various objects and surfaces. The merging procedure first considers all pairs of adjacent segments and computes a similarity metric from their CNN features. The pairs with higher similarity are considered for merging. Finally, the algorithm fits NURBS surfaces to the segments to determine whether each selected pair corresponds to a single surface. A comparison with state-of-the-art methods shows that the proposed method provides accurate and reliable scene segmentation.
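The merging loop described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. Cosine similarity between mean per-segment descriptors stands in for the unspecified similarity metric. A least-squares quadratic surface replaces the NURBS fit. A merge is accepted when the fitting error of the joined segment does not exceed the size-weighted error of the two separate fits, up to a tolerance. All names and thresholds (`merge_segments`, `sim_threshold`, `tolerance`, `min_error`) are hypothetical.

```python
import numpy as np


def fit_rmse(points):
    """RMSE of a least-squares quadratic surface z = f(x, y).

    Simplified stand-in for the NURBS surface fit named in the abstract.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    basis = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    coeffs, *_ = np.linalg.lstsq(basis, z, rcond=None)
    return float(np.sqrt(np.mean((basis @ coeffs - z) ** 2)))


def cosine_similarity(a, b):
    """Cosine similarity of two descriptor vectors (assumed metric)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def merge_segments(points, descriptors, adjacency,
                   sim_threshold=0.9, tolerance=1.05, min_error=1e-3):
    """Greedy region merging gated by descriptor similarity and surface fit.

    points:      dict, segment id -> (N, 3) array of 3D samples
    descriptors: dict, segment id -> (N, D) array of per-pixel descriptors
    adjacency:   iterable of (id_a, id_b) pairs of adjacent segments
    Returns a dict, segment id -> merged (N, 3) array.
    """
    points = dict(points)
    descriptors = dict(descriptors)
    adjacency = {frozenset(pair) for pair in adjacency}
    rejected = set()  # pairs that already failed the surface-fit test

    while True:
        # Score every adjacent pair that has not been rejected yet.
        candidates = []
        for pair in adjacency - rejected:
            a, b = sorted(pair)
            sim = cosine_similarity(descriptors[a].mean(axis=0),
                                    descriptors[b].mean(axis=0))
            if sim >= sim_threshold:
                candidates.append((sim, a, b))
        if not candidates:
            return points

        # Try the most similar pair first.
        _, a, b = max(candidates, key=lambda c: c[0])
        merged = np.vstack([points[a], points[b]])
        n_a, n_b = len(points[a]), len(points[b])
        err_separate = np.sqrt(
            (n_a * fit_rmse(points[a]) ** 2 + n_b * fit_rmse(points[b]) ** 2)
            / (n_a + n_b)
        )

        if fit_rmse(merged) <= tolerance * err_separate + min_error:
            # One surface explains both segments: merge b into a.
            points[a] = merged
            descriptors[a] = np.vstack([descriptors[a], descriptors[b]])
            del points[b], descriptors[b]
            relabeled = {frozenset(a if s == b else s for s in pair)
                         for pair in adjacency}
            adjacency = {pair for pair in relabeled if len(pair) == 2}
            # Segment a changed, so earlier rejections involving it are stale.
            rejected = {p for p in rejected if a not in p and b not in p}
        else:
            rejected.add(frozenset((a, b)))
```

In this sketch the descriptor similarity only proposes candidates, while the surface fit decides whether a merge is applied. Two segments with similar descriptors that lie on opposite sides of a depth step therefore stay separate.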
