VERSA - Video event recognition for surveillance applications

VERSA provides a general-purpose framework for defining and recognizing events in live or recorded surveillance video streams. VERSA's approach to event recognition is to use a declarative logic language to define the spatial and temporal relationships that characterize a given event or activity. Doing so requires defining certain fundamental spatial and temporal relationships as well as a high-level syntax for specifying frame templates and query parameters. Although the handling of uncertainty in the current VERSA implementation is simplistic, the language and architecture are amenable to extension with fuzzy logic or similar approaches. VERSA's high-level architecture is designed to work in XML-based, services-oriented environments. VERSA can be thought of as subscribing to the XML annotations streamed by a lower-level video analytics service that provides basic entity detection, labeling, and tracking. One or many VERSA Event Monitors could thus analyze video streams and provide alerts when certain events are detected.
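The abstract does not give VERSA's concrete rule syntax or annotation schema, but the overall idea can be sketched: an event monitor consumes per-frame XML annotations from a lower-level analytics service and evaluates an event definition built from a spatial predicate (an entity inside a zone) combined with a temporal predicate (the condition holding over consecutive frames). The XML element names, the inside/detect_loitering helpers, and the zone and threshold parameters below are illustrative assumptions, not VERSA's actual interface.

```python
# Illustrative sketch only: the XML schema, element names, and rule structure
# below are assumptions for demonstration, not VERSA's actual formats.
import xml.etree.ElementTree as ET
from collections import defaultdict

# Example annotation stream, as might be emitted per frame by a lower-level
# video analytics service providing entity detection, labeling, and tracking.
ANNOTATIONS = """
<frames>
  <frame number="1"><entity id="p1" label="person" x="12" y="40"/></frame>
  <frame number="2"><entity id="p1" label="person" x="55" y="42"/></frame>
  <frame number="3"><entity id="p1" label="person" x="57" y="43"/></frame>
</frames>
"""

def inside(entity, zone):
    # Spatial predicate: is the entity's position within a rectangular zone?
    x, y = float(entity.get("x")), float(entity.get("y"))
    x0, y0, x1, y1 = zone
    return x0 <= x <= x1 and y0 <= y <= y1

def detect_loitering(frames, zone, min_frames=2):
    # Temporal predicate: the spatial condition must hold for at least
    # `min_frames` consecutive frames before an alert is raised.
    streak = defaultdict(int)
    alerts = []
    for frame in frames:
        for entity in frame.iter("entity"):
            eid = entity.get("id")
            if entity.get("label") == "person" and inside(entity, zone):
                streak[eid] += 1
                if streak[eid] == min_frames:
                    alerts.append((eid, frame.get("number")))
            else:
                streak[eid] = 0
    return alerts

if __name__ == "__main__":
    frames = ET.fromstring(ANNOTATIONS).findall("frame")
    restricted_zone = (50, 30, 100, 60)  # x0, y0, x1, y1 (hypothetical zone)
    for eid, frame_no in detect_loitering(frames, restricted_zone):
        print(f"ALERT: entity {eid} loitering in restricted zone at frame {frame_no}")
```

In VERSA itself such a rule would be expressed declaratively rather than as imperative code; the sketch only shows how spatial and temporal relationships over streamed annotations combine into an event that triggers an alert.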
