Real-time human detection and tracking in complex environments using single RGBD camera

This paper presents a new approach to real-time human detection and tracking in cluttered and dynamic environments that integrates RGB and depth data. We introduce the notion of the Point Ensemble Image, which fully encodes both RGB and depth information from a virtual plan-view perspective, and we show that human detection and tracking in 3D space can be performed very effectively with this new representation. Our human detector takes advantage of depth data by first locating physically plausible candidates; in a second stage, it makes full use of both depth and color information through supervised learning. Finally, 3D trajectories of humans are generated by data association, in which joint statistics of color and height are computed and compared. Experimental results show that the system works satisfactorily in complex real-world situations.
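To make the plan-view idea concrete, the following is a minimal sketch of how colored 3D points might be binned into ground-plane cells. It is not the Point Ensemble Image as defined in the paper, whose construction is not given here. The function name `plan_view`, the parameters `cell_size` and `max_height`, and the choice to keep only the highest point's height and color per cell are all assumptions made for illustration. The sketch also assumes the points have already been transformed into a world frame whose z axis points up from the floor.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, world frame, z up (assumed)
Color = Tuple[int, int, int]        # (r, g, b)
Cell = Tuple[int, int]              # integer ground-plane cell index


def plan_view(points: List[Point],
              colors: List[Color],
              cell_size: float = 0.05,
              max_height: float = 2.3) -> Dict[Cell, dict]:
    """Bin colored 3D points into ground-plane cells (hypothetical helper).

    For each cell this keeps the number of points that fell into it and
    the height and color of its highest point. Points below the floor or
    above max_height are discarded as physically implausible for a person.
    """
    grid: Dict[Cell, dict] = {}
    for (x, y, z), rgb in zip(points, colors):
        if z < 0.0 or z > max_height:
            continue
        # Project vertically onto the floor and quantize to a cell index.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        entry = grid.get(cell)
        if entry is None:
            grid[cell] = {"height": z, "color": rgb, "count": 1}
        else:
            entry["count"] += 1
            if z > entry["height"]:
                entry["height"] = z
                entry["color"] = rgb
    return grid


# Example: two points share a cell, one lands elsewhere, one is too high.
example_points = [(0.02, 0.03, 1.7), (0.01, 0.04, 1.2),
                  (0.26, 0.0, 0.9), (0.0, 0.0, 3.0)]
example_colors = [(200, 10, 10), (0, 0, 0), (0, 255, 0), (1, 1, 1)]
example_grid = plan_view(example_points, example_colors)
```

In a representation of this kind, a standing person tends to appear as a compact cluster of tall cells, which is one reason a plan view is convenient for proposing physically plausible candidates before a learned classifier is applied.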
