Multi-Channel Generative Framework and Supervised Learning for Anomaly Detection in Surveillance Videos

Most recent state-of-the-art anomaly detection methods are based on apparent-motion and appearance reconstruction networks and use the error between generated and real information as detection features. These approaches achieve promising results using only normal samples for training. In this paper, our contributions are twofold. First, we propose a flexible multi-channel framework to generate multi-type frame-level features. Second, we study how supervised learning can improve detection performance. The multi-channel framework is based on four Conditional GANs (CGANs) that take various types of appearance and motion information as input and produce prediction information as output. These CGANs provide a better feature space to represent the distinction between normal and abnormal events. The difference between the generated and ground-truth information is then encoded by the Peak Signal-to-Noise Ratio (PSNR). We propose to classify these features in a classical supervised scenario by building a small training set that includes some abnormal samples from the original test set of each dataset. A binary Support Vector Machine (SVM) is applied for frame-level anomaly detection. Finally, we use Mask R-CNN as a detector to perform object-centric anomaly localization. Our solution is extensively evaluated on the Avenue, Ped1, Ped2, and ShanghaiTech datasets. Our experimental results demonstrate that PSNR features combined with a supervised SVM are better than the error maps computed by previous methods. We achieve state-of-the-art performance for frame-level AUC on Ped1 and ShanghaiTech. In particular, on the most challenging dataset, ShanghaiTech, a supervised training model outperforms the state-of-the-art unsupervised strategy by up to 9%.
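
To make the PSNR encoding step concrete, the following is a minimal sketch of how one PSNR value per channel could be assembled into a frame-level feature vector. It uses the standard definition PSNR = 10 log10(MAX^2 / MSE). The function names `psnr` and `frame_features`, the representation of frames as flattened lists of pixel values in [0, 1], and the way the channels are stacked are assumptions made for illustration only; they are not taken from the paper's implementation.

```python
import math
from typing import List, Sequence


def psnr(predicted: Sequence[float], target: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio (in dB) between two flattened frames.

    Assumes both frames hold pixel values in [0, max_value].
    """
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("frames must be non-empty and have the same size")
    # Mean squared error between the generated and ground-truth frame.
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    if mse == 0.0:
        return math.inf  # identical frames: the prediction error is zero
    return 10.0 * math.log10(max_value ** 2 / mse)


def frame_features(channel_outputs: Sequence[Sequence[float]],
                   channel_targets: Sequence[Sequence[float]]) -> List[float]:
    """One PSNR value per channel (hypothetical stacking, one per CGAN)."""
    if len(channel_outputs) != len(channel_targets):
        raise ValueError("each channel output needs a matching target")
    return [psnr(out, tgt) for out, tgt in zip(channel_outputs, channel_targets)]
```

A well-predicted (normal) frame yields a small error and therefore a high PSNR, while a poorly predicted (potentially abnormal) frame yields a low PSNR. A vector such as the one returned by `frame_features` is the kind of input a binary classifier like an SVM could then separate.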
