Probabilistic event logic for interval-based event recognition

This paper addresses the detection and segmentation of interrelated events in challenging videos with motion blur, occlusions, dynamic backgrounds, and missing observations. We argue that holistic reasoning about the time intervals of events and their temporal constraints is critical in such domains to overcome the noise inherent in low-level video representations. To this end, our first contribution is the formulation of probabilistic event logic (PEL) for representing temporal constraints among events. A PEL knowledge base consists of confidence-weighted formulas from a temporal event logic and specifies a joint distribution over the occurrence time intervals of all events. Our second contribution is a MAP inference algorithm for PEL that addresses the scalability problem of reasoning about the enormous number of time intervals and their constraints in a typical video. Specifically, our algorithm leverages the spanning-interval data structure to compactly represent and manipulate entire sets of time intervals without enumerating them. Our experiments on interpreting basketball videos show that PEL inference can jointly detect events and identify their time intervals, based on noisy input from primitive-event detectors.
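
To illustrate how a set of time intervals can be represented and manipulated without enumeration, the following is a minimal Python sketch of a simplified spanning interval. It is not the data structure or inference algorithm used in the paper: it assumes closed intervals over integer frame indices, and the class name SpanningInterval and its methods normalized, intersect, and count are hypothetical names chosen for this example. A spanning interval here stands for every interval [s, e] whose start s lies in one range and whose end e lies in another, with s <= e.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanningInterval:
    """All closed intervals [s, e] with s_lo <= s <= s_hi, e_lo <= e <= e_hi, s <= e.

    Simplifying assumption: endpoints are integer frame indices and always closed.
    """
    s_lo: int
    s_hi: int
    e_lo: int
    e_hi: int

    def normalized(self) -> Optional["SpanningInterval"]:
        """Tighten the bounds; return None if the represented set is empty."""
        s_hi = min(self.s_hi, self.e_hi)  # a start cannot exceed the latest end
        e_lo = max(self.e_lo, self.s_lo)  # an end cannot precede the earliest start
        if self.s_lo > s_hi or e_lo > self.e_hi:
            return None
        return SpanningInterval(self.s_lo, s_hi, e_lo, self.e_hi)

    def intersect(self, other: "SpanningInterval") -> Optional["SpanningInterval"]:
        """Intersect two interval sets using only their four bounds."""
        return SpanningInterval(
            max(self.s_lo, other.s_lo),
            min(self.s_hi, other.s_hi),
            max(self.e_lo, other.e_lo),
            min(self.e_hi, other.e_hi),
        ).normalized()

    def count(self) -> int:
        """Number of intervals represented, without listing (start, end) pairs."""
        n = self.normalized()
        if n is None:
            return 0
        # For each admissible start s, the valid ends are max(e_lo, s) .. e_hi.
        return sum(n.e_hi - max(n.e_lo, s) + 1 for s in range(n.s_lo, n.s_hi + 1))


# Usage: two candidate interval sets for the same (hypothetical) event.
a = SpanningInterval(0, 10, 5, 20)
b = SpanningInterval(3, 8, 0, 6)
both = a.intersect(b)
print(both, both.count() if both else 0)
```

In this sketch, intersecting two sets costs a constant number of comparisons regardless of how many intervals each set contains, which is the property that makes such a representation attractive when a video yields a very large number of candidate intervals.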
