A Constrained Probabilistic Petri Net Framework for Human Activity Detection in Video

Recognition of human activities in restricted settings such as airports, parking lots, and banks is of significant interest in security and automated surveillance systems. In such settings, data usually takes the form of surveillance videos that vary widely in quality and granularity. Interpreting and identifying human activities requires an activity model that (a) is rich enough to handle complex multi-agent interactions, (b) is robust to uncertainty in low-level processing, and (c) can handle ambiguities in the unfolding of activities. We present a computational framework for human activity representation based on Petri nets. We propose an extension, Probabilistic Petri Nets (PPN), and show how this model is well suited to address each of the above requirements in a wide variety of settings. We then focus on answering two types of questions: (i) what are the minimal sub-videos in which a given activity is identified with a probability above a certain threshold, and (ii) for a given video, which activity from a given set occurred with the highest probability? We provide the PPN-MPS algorithm for the first problem, as well as two different algorithms (naive PPN-MPA and PPN-MPA) to solve the second. Our experimental results on a dataset consisting of bank surveillance videos and an unconstrained TSA tarmac surveillance dataset show that our algorithms are fast and provide high-quality results.
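The two query types can be made concrete with a small brute-force sketch. This is only an illustration of the problem statements, not the PPN-MPS or PPN-MPA algorithms, which the abstract does not spell out. Every name here is hypothetical: `make_scorer` is a toy stand-in for a PPN (an ordered list of steps, each with an assumed probability), frames are reduced to string labels, and the labels and probabilities are invented for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical simplification: each frame is a label for a detected low-level event.
Frame = str
Scorer = Callable[[Sequence[Frame]], float]


def make_scorer(steps: List[Tuple[str, float]]) -> Scorer:
    """Toy stand-in for an activity model (NOT a real PPN).

    The activity is an ordered list of (label, probability) steps. A clip
    scores the product of the step probabilities if all steps occur in
    order, and 0.0 otherwise.
    """
    def score(frames: Sequence[Frame]) -> float:
        prob, i = 1.0, 0
        for label in frames:
            if i < len(steps) and label == steps[i][0]:
                prob *= steps[i][1]
                i += 1
        return prob if i == len(steps) else 0.0
    return score


def minimal_subvideos(frames: Sequence[Frame], score: Scorer,
                      threshold: float) -> List[Tuple[int, int]]:
    """Question (i): inclusive (start, end) intervals whose score meets the
    threshold and that contain no smaller interval that also meets it."""
    n = len(frames)
    hits = [(s, e) for s in range(n) for e in range(s, n)
            if score(frames[s:e + 1]) >= threshold]
    return [(s, e) for (s, e) in hits
            if not any((s2, e2) != (s, e) and s <= s2 and e2 <= e
                       for (s2, e2) in hits)]


def most_probable_activity(frames: Sequence[Frame],
                           scorers: Dict[str, Scorer]) -> Tuple[str, float]:
    """Question (ii): the activity whose model scores the whole video highest."""
    best = max(scorers, key=lambda name: scorers[name](frames))
    return best, scorers[best](frames)


# Invented example data for illustration only.
video = ["enter", "idle", "approach_teller", "leave", "enter", "approach_teller"]
models = {
    "deposit": make_scorer([("enter", 0.9), ("approach_teller", 0.8)]),
    "walk_through": make_scorer([("enter", 0.9), ("leave", 0.5)]),
}
clips = minimal_subvideos(video, models["deposit"], threshold=0.7)
winner, winner_prob = most_probable_activity(video, models)
```

The enumeration scores every interval, so it evaluates a quadratic number of clips in the video length; it is meant only to pin down what "minimal sub-video" and "most probable activity" mean in the two questions above.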
