Understanding Video Events: A Survey of Methods for Automatic Interpretation of Semantic Occurrences in Video

Understanding video events, i.e., translating low-level content in video sequences into high-level semantic concepts, is a research topic that has received much interest in recent years. Important applications of this work include smart surveillance systems, semantic video database indexing, and interactive systems. The technology can be applied to several video domains, including airport terminals, parking lots, traffic, subway stations, aerial surveillance, and sign language data. In this paper, we identify the two main components of the event understanding process: abstraction and event modeling. Abstraction is the process of molding the data into informative units to be used as input to the event model. Due to space restrictions, we limit our discussion of abstraction; see the study by Lavee et al. (Understanding video events: A survey of methods for automatic interpretation of semantic occurrences in video, Technion-Israel Inst. Technol., Haifa, Israel, Tech. Rep. CIS-2009-06, 2009) for a more complete treatment. Event modeling is devoted to describing events of interest formally and enabling recognition of these events as they occur in the video sequence. Event modeling can be further decomposed into the categories of pattern-recognition methods, state event models, and semantic event models. In this survey, we discuss this proposed taxonomy of the literature, offer a unifying terminology, and discuss popular event modeling formalisms (e.g., the hidden Markov model) and their use in video event understanding, using extensive examples from the literature. Finally, we consider the application domain of video event understanding in light of the proposed taxonomy and propose future directions for research in this field.
