Foreground Segmentation in Depth Imagery Using Depth and Spatial Dynamic Models for Video Surveillance Applications

Low-cost systems that obtain a high-quality foreground segmentation in indoor environments, almost independently of the illumination conditions, are highly desirable, especially for security and surveillance applications. In this paper, a novel foreground segmentation algorithm that uses only a Kinect depth sensor is proposed to meet these requirements. This is achieved by combining a mixture-of-Gaussians-based background subtraction algorithm with a new Bayesian network that robustly predicts the foreground/background regions between consecutive time steps. The Bayesian network explicitly exploits the intrinsic characteristics of the depth data by means of two dynamic models that estimate the spatial and depth evolution of the foreground/background regions. The most remarkable contribution is the depth-based dynamic model, which predicts the changes in the foreground depth distribution between consecutive time steps. This is a key difference with respect to visible imagery, where the color/gray distribution of the foreground is typically assumed to be constant. Experiments carried out on two different depth-based databases demonstrate that the proposed combination of algorithms obtains a more accurate foreground/background segmentation than other state-of-the-art approaches.
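As a rough illustration of the background subtraction stage only, the sketch below keeps one running Gaussian per pixel over depth frames and flags pixels that deviate strongly from it. This is a deliberate simplification and not the proposed method: the paper uses a mixture of Gaussians and adds a Bayesian network with spatial and depth dynamic models, neither of which is reproduced here. The class name `DepthBackgroundModel`, the parameters `alpha`, `k`, and `init_var`, their values, and the convention that a depth of 0 marks a missing reading are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

Frame = List[List[float]]  # depth in millimetres; 0 marks a missing reading (assumption)


@dataclass
class PixelModel:
    mean: float
    var: float


class DepthBackgroundModel:
    """One running Gaussian per pixel (a simplification of a Gaussian mixture)."""

    def __init__(self, first_frame: Frame, alpha: float = 0.05,
                 k: float = 2.5, init_var: float = 400.0) -> None:
        self.alpha = alpha  # learning rate (hypothetical value)
        self.k = k          # threshold in standard deviations (hypothetical value)
        self.models = [[PixelModel(d, init_var) for d in row] for row in first_frame]

    def apply(self, frame: Frame) -> List[List[bool]]:
        """Return a foreground mask and update the background statistics."""
        mask = []
        for row, model_row in zip(frame, self.models):
            mask_row = []
            for d, m in zip(row, model_row):
                if d == 0:  # missing depth: label as background, skip the update
                    mask_row.append(False)
                    continue
                diff = d - m.mean
                is_fg = diff * diff > (self.k ** 2) * m.var
                if not is_fg:  # only background pixels update the model
                    m.mean += self.alpha * diff
                    m.var = (1 - self.alpha) * m.var + self.alpha * diff * diff
                mask_row.append(is_fg)
            mask.append(mask_row)
        return mask


# Synthetic example: a flat wall at 2 m, then one pixel jumps to 1.2 m.
background = [[2000.0] * 4 for _ in range(3)]
model = DepthBackgroundModel(background)
frame = [row[:] for row in background]
frame[1][2] = 1200.0
mask = model.apply(frame)
```

In this synthetic case only the pixel at row 1, column 2 is marked as foreground. A per-pixel test of this kind has no notion of how the foreground moves or changes depth between frames, which is the gap the abstract's two dynamic models are meant to address.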
