Spatio-temporal context analysis within video volumes for anomalous-event detection and localization

In this paper, we propose an anomaly-detection approach for video surveillance in crowded scenes. The approach is an unsupervised statistical learning framework based on analysis of the spatio-temporal configuration of video volumes within video cubes. It learns global activity patterns and local salient behavior patterns via clustering and sparse coding, respectively. Based on the composition-pattern dictionary learned from normal behavior, a sparse reconstruction cost criterion is designed to detect anomalies that occur in video both globally and locally. In addition, multi-scale analysis is employed to localize anomalies accurately under scale variations of abnormal events. The approach is evaluated on publicly available anomaly-detection datasets and compared with existing work. The experimental results demonstrate that it not only detects various anomalies more efficiently, but also localizes anomalous regions more accurately.
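
To illustrate the general idea behind a sparse reconstruction cost criterion, the following minimal Python sketch scores a feature vector by how cheaply it can be reconstructed from a dictionary of normal patterns. It is not the paper's implementation. The cost form (squared reconstruction error plus an L1 penalty), the ISTA solver, the weight `lam`, and the small orthonormal toy dictionary are all assumptions made for the example; the dictionary stands in for one learned from normal behavior.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, D, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.

    ISTA is an assumed solver choice for this sketch, not the paper's method.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def reconstruction_cost(x, D, lam=0.1, n_iter=200):
    """Sparse reconstruction cost of x under dictionary D (assumed form)."""
    a = sparse_code_ista(x, D, lam, n_iter)
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

# Toy setup (hypothetical data): 5 orthonormal "normal" atoms in 20 dimensions,
# plus one extra direction that the dictionary cannot represent.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((20, 6)))
D = Q[:, :5]

normal_sample = D @ np.array([1.0, 0.5, 0.0, 0.0, 0.0])
# Same energy as the normal sample, but orthogonal to every atom in D.
anomalous_sample = Q[:, 5] * np.linalg.norm(normal_sample)

normal_cost = reconstruction_cost(normal_sample, D)
anomaly_cost = reconstruction_cost(anomalous_sample, D)
print(f"normal cost:  {normal_cost:.3f}")
print(f"anomaly cost: {anomaly_cost:.3f}")
```

In this toy setup the normal sample costs about 0.14 and the anomalous one about 0.625, so a threshold between the two would flag the second sample. In practice the threshold, the features, and the dictionary would all have to come from training data of normal behavior.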
