Automatic Event Detection for Signal-based Surveillance

Signal-based surveillance systems such as closed-circuit television (CCTV) have been widely installed in public places. These systems are normally used to find events of security interest, and they play a significant role in public safety. Though such systems still rely heavily on human labour to monitor the captured information, a number of automatic techniques have been proposed for analysing the data. This article provides an overview of automatic surveillance event detection techniques. Despite its popularity in research, automatic event detection remains too challenging a problem for real-world deployment. The challenges come not only from the detection techniques, such as signal processing and machine learning, but also from the experimental design, with factors such as data collection, evaluation protocols, and ground-truth annotation. Finally, this article proposes that multi-disciplinary research is the path towards a solution to this problem.
