Future Frame Prediction Using Convolutional VRNN for Anomaly Detection

Anomaly detection in videos aims at reporting anything that does not conform the normal behaviour or distribution. However, due to the sparsity of abnormal video clips in real life, collecting annotated data for supervised learning is exceptionally cumbersome. Inspired by the practicability of generative models for semi-supervised learning, we propose a novel sequential generative model based on variational autoencoder (VAE) for future frame prediction with convolutional LSTM (ConvLSTM). To the best of our knowledge, this is the first work that considers temporal information in future frame prediction based anomaly detection framework from the model perspective. Our experiments demonstrate that our approach is superior to the state-of-the-art methods on three benchmark datasets.

[1]  John R. Hershey,et al.  Approximating the Kullback Leibler Divergence Between Gaussian Mixture Models , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[2]  Nuno Vasconcelos,et al.  Anomaly detection in crowded scenes , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Nicu Sebe,et al.  Detecting anomalous events in videos by learning deep representations of appearance and motion , 2017, Comput. Vis. Image Underst..

[4]  Sanjay Chawla,et al.  Robust, Deep and Inductive Anomaly Detection , 2017, ECML/PKDD.

[5]  Cewu Lu,et al.  Abnormal Event Detection at 150 FPS in MATLAB , 2013, 2013 IEEE International Conference on Computer Vision.

[6]  Andreas E. Savakis,et al.  Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks , 2016, ArXiv.

[7]  Shenghua Gao,et al.  A Revisit of Sparse Coding Based Anomaly Detection in Stacked RNN Framework , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Mahmood Fathy,et al.  Video anomaly detection and localisation based on the sparsity and reconstruction error of auto-encoder , 2016 .

[9]  Sungzoon Cho,et al.  Variational Autoencoder based Anomaly Detection using Reconstruction Probability , 2015 .

[10]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[11]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Yoshua Bengio,et al.  A Recurrent Latent Variable Model for Sequential Data , 2015, NIPS.

[13]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[14]  David A. Clausi,et al.  Goal-based trajectory analysis for unusual behaviour detection in intelligent surveillance , 2011, Image Vis. Comput..

[15]  K. Grauman,et al.  Observe locally, infer globally: A space-time MRF for detecting abnormal activities with incremental updates , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[17]  Mubarak Shah,et al.  Chaotic invariants of Lagrangian particle trajectories for anomaly detection in crowded scenes , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[19]  Olof Mogren,et al.  C-RNN-GAN: Continuous recurrent neural networks with adversarial training , 2016, ArXiv.

[20]  Enhong Chen,et al.  Image Denoising and Inpainting with Deep Neural Networks , 2012, NIPS.

[21]  Fei-Fei Li,et al.  Online detection of unusual events in videos via dynamic sparse coding , 2011, CVPR 2011.

[22]  Kristen Grauman,et al.  Observe locally, infer globally: A space-time MRF for detecting abnormal activities with incremental updates , 2009, CVPR.

[23]  Mahmood Fathy,et al.  Adversarially Learned One-Class Classifier for Novelty Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[25]  Martial Hebert,et al.  A Discriminative Framework for Anomaly Detection in Large Videos , 2016, ECCV.

[26]  Junsong Yuan,et al.  Sparse reconstruction cost for abnormal event detection , 2011, CVPR 2011.

[27]  Shenghua Gao,et al.  Remembering history with convolutional LSTM for anomaly detection , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[28]  Zhou Wang,et al.  Multiscale structural similarity for image quality assessment , 2003, The Thrity-Seventh Asilomar Conference on Signals, Systems & Computers, 2003.

[29]  Cordelia Schmid,et al.  Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.

[30]  Jürgen Schmidhuber,et al.  Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction , 2011, ICANN.

[31]  Yann LeCun,et al.  Deep multi-scale video prediction beyond mean square error , 2015, ICLR.

[32]  Jonghyun Choi,et al.  Learning Temporal Regularity in Video Sequences , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Shenghua Gao,et al.  Future Frame Prediction for Anomaly Detection - A New Baseline , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  D. Pollard Asymptotics for Least Absolute Deviation Regression Estimators , 1991, Econometric Theory.