3D Point Cloud Segmentation Using a Fully Connected Conditional Random Field

Traditional image segmentation methods that rely on low-level image features are usually difficult to adapt to higher-level tasks such as object recognition and scene understanding. Object segmentation has emerged as a new challenge in this field: it aims to obtain more meaningful segments that correspond to semantic objects in the scene by analyzing a combination of different types of information. 3D point cloud data obtained from consumer depth sensors has been exploited to tackle many computer vision problems because it carries richer information about the geometry of 3D scenes than 2D images do. At the same time, it brings new challenges, since depth information is usually noisy, sparse and unorganized. In this paper, we present a novel point cloud segmentation approach for segmenting interacting objects in a stream of point clouds by exploiting spatio-temporal coherence. We pose the problem as an energy minimization task in a fully connected conditional random field, with an energy function defined on information from both the current and previous frames. We compare different methods and demonstrate the improved segmentation performance and robustness of the proposed approach on sequences with over 2k frames.
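To make the energy-minimization formulation concrete, the following minimal sketch evaluates the energy of a labeling under a generic fully connected conditional random field: a unary cost per point plus a pairwise penalty over every pair of points that receive different labels, weighted by a Gaussian affinity between their features. The function name `dense_crf_energy`, the Potts label compatibility, the single Gaussian kernel, and the `weight` and `bandwidth` parameters are illustrative assumptions. The abstract does not specify the exact potentials or how previous-frame information enters them, so this should not be read as the energy function used in the paper.

```python
import numpy as np

def dense_crf_energy(labels, unary, feats, weight=1.0, bandwidth=1.0):
    """Energy of a labeling under a generic fully connected CRF.

    Assumed form (illustrative only): Potts compatibility and one
    Gaussian kernel over per-point feature vectors.
    labels: (n,) integer label per point
    unary:  (n, num_labels) cost of assigning each label to each point
    feats:  (n, d) feature vector per point (e.g. position, color)
    """
    n = len(labels)
    # Unary term: cost of the label chosen for each point.
    e_unary = unary[np.arange(n), labels].sum()
    # Squared feature distance between every pair of points.
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=-1)
    affinity = weight * np.exp(-d2 / (2.0 * bandwidth ** 2))
    # Potts model: a pair is penalized only when its labels differ.
    differ = labels[:, None] != labels[None, :]
    # Count each unordered pair once (strict upper triangle, i < j).
    e_pair = np.triu(affinity * differ, 1).sum()
    return e_unary + e_pair

# Two nearby points and one distant point, with uninformative unaries.
feats = np.array([[0.0], [0.0], [10.0]])
unary = np.zeros((3, 2))
coherent = dense_crf_energy(np.array([0, 0, 1]), unary, feats)
split = dense_crf_energy(np.array([0, 1, 1]), unary, feats)
print(coherent < split)
```

In this toy case the labeling that keeps the two nearby points together has lower energy than the one that separates them, which is the behavior the pairwise term is meant to encourage. Evaluating all pairs explicitly costs O(n²) and is only practical for small inputs; real point clouds require an approximate inference scheme instead.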
