Multiple human tracking in RGB-depth data: a survey

Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are limited by occlusions and illumination variations. In recent years, the arrival of cheap RGB-depth devices has led to many new approaches to MHT, many of which integrate colour and depth cues to improve every stage of the process. In this survey, the authors present the common processing pipeline of these methods and review their methodology based (a) on how they implement this pipeline and (b) on what role depth plays within each stage of it. They identify and introduce existing, publicly available benchmark datasets and software resources that fuse colour and depth data for MHT. Finally, they present a brief comparative evaluation of the performance of those works that have applied their methods to these datasets.
