Unsupervised RGB-D image segmentation using joint clustering and region merging

Recent imaging sensors, such as the Kinect, provide depth synchronized with color; the resulting data is called an RGB-D image. In this paper, we propose an unsupervised method for indoor RGB-D image segmentation and analysis. We consider a statistical image generation model based on the color and geometry of the scene. Our method consists of joint color-spatial-axial clustering followed by statistical planar region merging. We evaluate our method on the NYU depth database V2 (NYUD2) and compare it with existing unsupervised RGB-D segmentation methods. Results show that our method is comparable with state-of-the-art methods. Moreover, it opens interesting perspectives for fusing color and geometry in an unsupervised manner.
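To make the "color-spatial-axial" description concrete, the following is a minimal sketch of how per-pixel joint features might be assembled from an RGB-D image. It is not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, estimates surface normals by simple finite differences, and reads "axial" as meaning that a normal `n` and its opposite `-n` describe the same axis. All function names are illustrative, and the clustering and region merging steps are not shown.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Pinhole back-projection of a depth map (H, W) to 3D points (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def estimate_normals(points):
    """Unit normals from the cross product of finite differences along the image axes."""
    du = np.gradient(points, axis=1)  # change along image columns
    dv = np.gradient(points, axis=0)  # change along image rows
    n = np.cross(du, dv)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.maximum(norm, 1e-12)

def axial_dissimilarity(n1, n2):
    """Sign-invariant dissimilarity: 0 when n1 = +/- n2, 1 when they are orthogonal."""
    return 1.0 - np.sum(n1 * n2, axis=-1) ** 2

def joint_features(rgb, depth, fx, fy, cx, cy):
    """Stack color (3), 3D position (3), and normal (3) into an (H*W, 9) array."""
    points = backproject(depth, fx, fy, cx, cy)
    normals = estimate_normals(points)
    feats = np.concatenate([rgb.astype(float), points, normals], axis=-1)
    return feats.reshape(-1, 9)
```

A clustering step would then operate on these nine-dimensional rows, comparing the normal component with a sign-invariant measure such as `axial_dissimilarity` instead of a Euclidean distance.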
