Scene-Aware Context Reasoning for Unsupervised Abnormal Event Detection in Videos

In this paper, we propose a scene-aware context reasoning method for unsupervised abnormal event detection in videos. The method exploits context information from visual features to bridge the semantic gap between visual context and the meaning of abnormal events. In particular, we build a spatio-temporal context graph to model visual context information, including the appearances of objects, the spatio-temporal relationships among objects, and scene types. The context information is encoded into the nodes and edges of the graph, and their states are iteratively updated by multiple RNNs with message passing for context reasoning. To infer the spatio-temporal context graph in various scenes, we develop a graph-based deep Gaussian mixture model that clusters scenes in an unsupervised manner. We then compute frame-level anomaly scores from the context information to discriminate abnormal events in various scenes. Evaluations on three challenging datasets, UCF-Crime, Avenue, and ShanghaiTech, demonstrate the effectiveness of our method.
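
To make the idea of frame-level anomaly scoring with a Gaussian mixture concrete, the following is a minimal sketch and not the model described above. It assumes a hand-specified mixture with diagonal covariances in place of the learned graph-based deep Gaussian mixture model, and it assumes each frame has already been reduced to a fixed-length feature vector. The function names (gmm_energy, frame_scores) and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def gmm_energy(x, weights, means, variances):
    """Negative log-likelihood of x under a diagonal-covariance GMM.

    Illustrative assumption: the mixture parameters are given by hand,
    not learned from video features.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # log of (mixture weight * diagonal Gaussian density)
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # log-sum-exp over components for numerical stability
    m = max(log_terms)
    log_lik = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return -log_lik


def frame_scores(frame_features, weights, means, variances):
    """Min-max normalise per-frame energies to [0, 1]; higher means more anomalous."""
    energies = [gmm_energy(f, weights, means, variances) for f in frame_features]
    lo, hi = min(energies), max(energies)
    if hi == lo:
        return [0.0 for _ in energies]
    return [(e - lo) / (hi - lo) for e in energies]


# Two hypothetical "scene" clusters in a 2-D feature space.
weights = [0.5, 0.5]
means = [(0.0, 0.0), (5.0, 5.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]

# The first two frames lie near a cluster centre; the third lies far from both.
frames = [(0.1, -0.2), (5.0, 4.8), (2.5, 9.0)]
scores = frame_scores(frames, weights, means, variances)
```

In this toy setup the third frame receives the largest score because it is unlikely under every mixture component. Min-max normalisation over the frames of a video is one common convention for turning raw energies into scores in [0, 1]; it is used here as an assumption rather than as a description of the scoring rule in the paper.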
