Human detection in surveillance videos and its applications - a review

Detecting human beings accurately in a visual surveillance system is crucial for diverse application areas including abnormal event detection, human gait characterization, congestion analysis, person identification, gender classification and fall detection for elderly people. The first step of the detection process is to detect a moving object. Object detection can be performed using background subtraction, optical flow or spatio-temporal filtering techniques. Once detected, a moving object can be classified as a human being using shape-based, texture-based or motion-based features. This paper presents a comprehensive review, with comparisons, of available techniques for detecting human beings in surveillance videos. The characteristics of a few benchmark datasets and future research directions for human detection are also discussed.
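The two-stage pipeline outlined in the abstract (detect a moving object, then classify it) can be illustrated with a minimal sketch. This is not a method from the paper. It assumes grayscale frames stored as nested lists, uses a running-average background model as one simple form of background subtraction, and substitutes a crude bounding-box aspect-ratio test for a real shape-based classifier. All function names, the learning rate `alpha`, the `threshold`, and `min_aspect` are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[float]]          # grayscale frame: rows of pixel intensities
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), inclusive


def update_background(background: Frame, frame: Frame, alpha: float = 0.05) -> Frame:
    """Blend the new frame into a running-average background model.

    alpha is an assumed learning rate, not a value taken from the paper.
    """
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background: Frame, frame: Frame, threshold: float = 25.0) -> List[List[bool]]:
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Box]:
    """Return the tightest box around all foreground pixels, or None if there are none."""
    coords = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


def looks_upright(box: Box, min_aspect: float = 1.5) -> bool:
    """Toy shape cue: treat a region as human-like if it is taller than it is wide.

    A real shape-based classifier would use far richer features; this is
    only a placeholder for the classification stage.
    """
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    return height / width >= min_aspect


if __name__ == "__main__":
    # Synthetic 6x6 scene: empty background, then a 4-tall by 2-wide bright blob.
    background = [[0.0] * 6 for _ in range(6)]
    frame = [[0.0] * 6 for _ in range(6)]
    for r in range(1, 5):
        for c in range(2, 4):
            frame[r][c] = 200.0

    mask = foreground_mask(background, frame)
    box = bounding_box(mask)
    print("box:", box, "human-like:", box is not None and looks_upright(box))
    background = update_background(background, frame)
```

A running average is used here only because it is the simplest background model to show in a few lines. A single global bounding box also assumes one moving object per frame; handling several objects would need connected-component labelling first.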
