Tracking Humans for the Evaluation of their Motion in Image Sequences

Detecting people entering a sterile zone is a common task for visual surveillance systems. We propose a novel texture classifier that detects a person in a single video frame in real time, without temporal information, by identifying salient texture regions. Extending this classifier by fusing it with simple motion information significantly outperforms standard motion tracking. Detection time can be reduced by combining texture classification with Kalman filtering. F1 measures are given for the i-LIDS sterile zone dataset of the UK Home Office. The fusion approach, running at 10 frames per second, achieves the highest F1 of 0.92 on the 24-hour test dataset.
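
For readers unfamiliar with the F1 measure reported above, the following minimal sketch shows how it is conventionally computed as the harmonic mean of precision and recall. The function name `f1_score` and the detection counts are hypothetical and chosen purely for illustration; they are not taken from the i-LIDS evaluation, and the abstract does not state how detections were matched to ground truth.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 measure: the harmonic mean of precision and recall."""
    # With no true positives, precision and recall are both zero (or
    # undefined), so F1 is reported as 0 by convention.
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (assumed, not from the dataset):
# 8 correct alarms, 2 false alarms, 2 missed intrusions.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(f"F1 = {example_f1:.2f}")
```

Because F1 combines precision and recall into one number, a high value such as the reported 0.92 requires both few false alarms and few missed detections.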
