Activity Modeling Using Event Probability Sequences

Changes in motion properties of trajectories provide useful cues for modeling and recognizing human activities. We associate an event with significant changes that are localized in time and space, and represent activities as a sequence of such events. The localized nature of events allows for detection of subtle changes or anomalies in activities. In this paper, we present a probabilistic approach for representing events using the hidden Markov model (HMM) framework. Using trained HMMs for activities, an event probability sequence is computed for every motion trajectory in the training set. This sequence gives the probability of an event occurring at each time instant. Although the parameters of the trained HMMs depend on viewing direction, the event probability sequences are robust to changes in viewing direction. We describe sufficient conditions for the existence of view invariance. The usefulness of the proposed event representation is illustrated using activity recognition and anomaly detection. Experiments using the indoor University of Central Florida human action dataset, the Carnegie Mellon University Motion Capture dataset, and the outdoor Transportation Security Administration airport tarmac surveillance dataset show encouraging results.
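
To make the idea of an event probability sequence concrete, the following is a minimal sketch rather than the method of the paper. The abstract does not give the exact event definition, so the sketch assumes that an event is a change of hidden state between consecutive time instants, and it assumes a discrete-observation HMM. The function name `event_probability_sequence` and the toy parameters `pi`, `A`, and `B` are hypothetical. The posterior transition probabilities are computed with the standard scaled forward-backward recursions.

```python
import numpy as np

def event_probability_sequence(pi, A, B, obs):
    """Return P(q_t != q_{t+1} | obs) for t = 0..T-2 under a discrete HMM.

    Assumption (not taken from the paper): an "event" is a change of hidden
    state between two consecutive time instants.
    pi: (N,) initial state probabilities
    A:  (N, N) state transition matrix, rows sum to 1
    B:  (N, M) emission matrix, B[i, k] = P(symbol k | state i)
    obs: sequence of T integer symbols in [0, M)
    """
    pi = np.asarray(pi, dtype=float)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    obs = np.asarray(obs, dtype=int)
    T, N = len(obs), len(pi)

    # Scaled forward pass: alpha[t] is normalized to sum to 1 at each step.
    alpha = np.zeros((T, N))
    scale = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Scaled backward pass, reusing the forward scaling factors.
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

    events = np.zeros(T - 1)
    for t in range(T - 1):
        # xi[i, j] = P(q_t = i, q_{t+1} = j | obs); its entries sum to 1.
        xi = (alpha[t][:, None] * A
              * (B[:, obs[t + 1]] * beta[t + 1])[None, :]) / scale[t + 1]
        # Probability of leaving the current state = 1 - P(staying).
        events[t] = 1.0 - np.trace(xi)
    return events

# Toy two-state model (hypothetical values) and a sequence whose symbol
# switches once, between index 2 and index 3.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.9, 0.1], [0.1, 0.9]]
events = event_probability_sequence(pi, A, B, [0, 0, 0, 1, 1, 1])
print(np.round(events, 3))
```

In this toy case the sequence has one entry per pair of consecutive time instants, and its largest value falls at the position where the observed symbol switches. A trajectory-based system would replace the discrete symbols with features derived from motion and use HMMs trained on them.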
