3D point cloud segmentation oriented to the analysis of interactions

Given the widespread availability of point cloud data from consumer depth sensors, 3D point cloud segmentation becomes a promising building block for high level applications such as scene understanding and interaction analysis. It benefits from the richer information contained in real world 3D data compared to 2D images. This also implies that the classical color segmentation challenges have shifted to RGBD data, and new challenges have also emerged as the depth information is usually noisy, sparse and unorganized. Meanwhile, the lack of 3D point cloud ground truth labeling also limits the development and comparison among methods in 3D point cloud segmentation. In this paper, we present two contributions: a novel graph based point cloud segmentation method for RGBD stream data with interacting objects and a new ground truth labeling for a previously published data set [1]. This data set focuses on interaction (merge and split between `object' point clouds), which differentiates itself from the few existing labeled RGBD data sets which are more oriented to Simultaneous Localization And Mapping (SLAM) tasks. The proposed point cloud segmentation method is evaluated with the 3D point cloud ground truth labeling. Experiments show the promising result of our approach.

[1]  Vladimir Kolmogorov,et al.  An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision , 2004, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Javier Ruiz Hidalgo,et al.  Detecting end-effectors on 2.5D data using geometric deformable models: Application to human pose estimation , 2013, Comput. Vis. Image Underst..

[3]  Wolfram Burgard,et al.  A benchmark for the evaluation of RGB-D SLAM systems , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[4]  Lucian Cosmin Goron,et al.  Reconstruction and Verification of 3D Object Models for Grasping , 2009, ISRR.

[5]  Miguel Á. Carreira-Perpiñán,et al.  Multiscale conditional random fields for image labeling , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[6]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[7]  VekslerOlga,et al.  Fast Approximate Energy Minimization via Graph Cuts , 2001 .

[8]  Antonio Criminisi,et al.  TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation , 2006, ECCV.

[9]  Mei Han,et al.  Efficient hierarchical graph-based video segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[10]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Olga Veksler,et al.  Fast Approximate Energy Minimization via Graph Cuts , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Nico Blodow,et al.  Fast geometric point labeling using conditional random fields , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[13]  24th European Signal Processing Conference, EUSIPCO 2016, Budapest, Hungary, August 29 - September 2, 2016 , 2016, European Signal Processing Conference.

[14]  Hedvig Kjellström,et al.  Audio-visual classification and detection of human manipulation actions , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[15]  James M. Keller,et al.  Histogram of Oriented Normal Vectors for Object Recognition with a Depth Sensor , 2012, ACCV.

[16]  Chenliang Xu,et al.  Evaluation of super-voxel methods for early video processing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Michael J. Swain,et al.  Color indexing , 1991, International Journal of Computer Vision.

[18]  Florentin Wörgötter,et al.  Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Hans-Peter Seidel,et al.  A data-driven approach for real-time full body pose reconstruction from a depth camera , 2011, Vision.

[20]  Thorsten Joachims,et al.  Semantic Labeling of 3D Point Clouds for Indoor Scenes , 2011, NIPS.

[21]  Axel Gräser,et al.  Planar segmentation from depth images using gradient of depth feature , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[22]  Vladimir Kolmogorov,et al.  What energy functions can be minimized via graph cuts? , 2002, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Sven Behnke,et al.  Real-Time Plane Segmentation Using RGB-D Cameras , 2012, RoboCup.