Real-time people detection and tracking for indoor surveillance using multiple top-view depth cameras

This paper proposes a real-time indoor surveillance system that tracks people using multiple depth cameras mounted overhead in a vertical top-view configuration. This setup leads to a novel framework that addresses the traditional challenges of multi-person tracking in surveillance, such as severe occlusion, similar appearance, illumination changes, and outline deformation. To cover the entire indoor scene, the camera images are stitched according to the cameras' spatial relations. Background subtraction is then performed on the stitched top-view image to extract foreground objects from the cluttered environment. A detection scheme that cascades graph-based segmentation, a head hemiellipsoid model, and a geodesic distance map is used to detect humans. Moreover, a shape feature based on diffusion distance is designed to verify human tracking hypotheses within a particle filter. Experimental results demonstrate real-time performance and robustness in comparison with several state-of-the-art detection and tracking algorithms.
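To make the hypothesis verification step more concrete, the following is a minimal sketch of how a diffusion distance between histograms could be turned into particle weights. It is not the paper's implementation: the abstract does not specify the shape feature, so the sketch assumes a plain 1-D histogram per particle, and the names `diffusion_distance` and `particle_weights`, the 3-tap Gaussian kernel, `sigma`, `max_layers`, and the likelihood scale `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def diffusion_distance(h1, h2, sigma=0.5, max_layers=8):
    """Diffusion-style distance between two 1-D histograms (illustrative sketch).

    The difference of the histograms is repeatedly smoothed with a Gaussian
    kernel and downsampled by 2; the L1 norms of all layers are summed.
    """
    d = np.asarray(h1, dtype=float) - np.asarray(h2, dtype=float)

    # Assumed 3-tap Gaussian kernel, normalised to sum to 1.
    taps = np.array([-1.0, 0.0, 1.0])
    kernel = np.exp(-(taps ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()

    total = np.abs(d).sum()  # layer 0: plain L1 difference
    for _ in range(max_layers):
        # Stop once the layer is shorter than the kernel, so that
        # mode="same" always returns an array of the same length as d.
        if d.size < kernel.size:
            break
        d = np.convolve(d, kernel, mode="same")[::2]  # smooth, then halve
        total += np.abs(d).sum()
    return total


def particle_weights(particle_hists, reference_hist, lam=5.0):
    """Map per-particle distances to normalised weights (assumed exp likelihood)."""
    dists = np.array(
        [diffusion_distance(h, reference_hist) for h in particle_hists]
    )
    w = np.exp(-lam * dists)
    return w / w.sum()


if __name__ == "__main__":
    reference = np.array([0, 0, 1, 0, 0, 0, 0, 0], dtype=float)
    near = np.array([0, 0, 0, 1, 0, 0, 0, 0], dtype=float)  # shifted by 1 bin
    far = np.array([0, 0, 0, 0, 0, 0, 0, 1], dtype=float)   # shifted by 5 bins
    print(diffusion_distance(reference, near))
    print(diffusion_distance(reference, far))
    print(particle_weights([near, far], reference))
```

In this toy example a plain bin-wise L1 distance would score both shifted histograms identically, whereas the layered distance assigns a smaller value to the nearby shift, so the corresponding particle receives the larger weight. This cross-bin sensitivity is the property that makes such a distance attractive for comparing shape features that deform slightly between frames.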
