Reliable Human Detection and Tracking in Top-View Depth Images

This paper presents a method for human detection and tracking in depth images captured by a top-view camera system. We introduce a new feature descriptor that outperforms state-of-the-art features such as Simplified Local Ternary Patterns in this scenario. We use this descriptor to train a head-shoulder detector with a discriminative classification scheme. A separate processing step ensures that only a minimal but sufficient number of head-shoulder candidates is evaluated, which contributes to excellent runtime performance. A final tracking step reliably propagates detections over time and provides stable tracking results. The quality of the presented method allows us to recognize many challenging situations, such as people tailgating and piggybacking.
