Hierarchical Language-based Representation of Events in Video Streams

We aim to define an event ontology that allows complex spatio-temporal events common in the physical world to be represented naturally as compositions of simpler events. Events are organized into a three-level hierarchy. Primitive events are defined directly from the properties of mobile objects. Single-thread composite events consist of a number of primitive events in temporal sequence. Multi-thread composite events consist of a number of single-thread events related by temporal, spatial, or logical relationships. This hierarchical representation leads naturally to a language for describing events. We define an Event Recognition Language (ERL) that lets users define events of interest conveniently, without interacting with the low-level processing in the program. We also briefly mention some approaches to computing the proposed representation.
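To make the three-level hierarchy concrete, the following is a minimal Python sketch of one way such a representation could be organized. It is not ERL syntax and not the authors' implementation: the class names, the per-frame property dictionaries, the greedy sequence matching, and the `overlaps` relation are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical per-frame properties of one mobile object, e.g. {"speed": 1.2}.
Frame = Dict[str, float]
Interval = Tuple[int, int]  # inclusive (start_frame, end_frame)


@dataclass
class PrimitiveEvent:
    """Level 1: a predicate evaluated directly on mobile object properties."""
    name: str
    holds: Callable[[Frame], bool]

    def intervals(self, track: List[Frame]) -> List[Interval]:
        """Return maximal runs of consecutive frames where the predicate holds."""
        out: List[Interval] = []
        start = None
        for i, frame in enumerate(track):
            if self.holds(frame):
                if start is None:
                    start = i
            elif start is not None:
                out.append((start, i - 1))
                start = None
        if start is not None:
            out.append((start, len(track) - 1))
        return out


@dataclass
class SingleThreadEvent:
    """Level 2: primitive events of one actor in temporal sequence."""
    name: str
    sequence: List[PrimitiveEvent]

    def intervals(self, track: List[Frame]) -> List[Interval]:
        """Greedy match (an assumption): each later step takes the earliest
        interval that starts after the previous step has ended."""
        results: List[Interval] = []
        for start, end in self.sequence[0].intervals(track):
            matched = True
            for prim in self.sequence[1:]:
                later = [iv for iv in prim.intervals(track) if iv[0] > end]
                if not later:
                    matched = False
                    break
                end = later[0][1]
            if matched:
                results.append((start, end))
        return results


def overlaps(a: Interval, b: Interval) -> bool:
    """One possible temporal relation: the two intervals share a frame."""
    return a[0] <= b[1] and b[0] <= a[1]


@dataclass
class MultiThreadEvent:
    """Level 3: two single-thread events linked by a relation on intervals."""
    name: str
    left: SingleThreadEvent
    right: SingleThreadEvent
    relation: Callable[[Interval, Interval], bool]

    def matches(self, left_track: List[Frame],
                right_track: List[Frame]) -> List[Tuple[Interval, Interval]]:
        """Return every pair of interval matches that satisfies the relation."""
        return [(a, b)
                for a in self.left.intervals(left_track)
                for b in self.right.intervals(right_track)
                if self.relation(a, b)]
```

In this sketch a user would only compose named events (for example, a hypothetical "move then stop" sequence for each of two people, joined by `overlaps`) and would never touch the per-frame tracking data directly, which is the separation from low-level processing that the abstract attributes to a language-based description.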
