Paper information - Energy-based Models for Video Anomaly Detection

Energy-based Models for Video Anomaly Detection

Automated detection of abnormalities in data has been an active research area in recent years because of its diverse practical applications, including video surveillance, industrial damage detection and network intrusion detection. However, building an effective anomaly detection system is a non-trivial task, since it requires tackling several challenging issues: the shortage of annotated data, the inability to define anomalous objects explicitly and the high cost of feature engineering. Unlike existing approaches, which only partially solve these problems, we develop a unique framework that copes with all of them simultaneously. Instead of handling the ambiguous definition of anomalous objects, we propose to work with regular patterns, for which unlabeled data is abundant and usually easy to collect in practice. This allows our system to be trained in a completely unsupervised manner and frees us from the need for costly data annotation. By learning a generative model that captures the distribution of normal data, we can isolate abnormal data points as those that receive low normality scores (high abnormality scores). Moreover, by leveraging the power of generative networks, i.e. energy-based models, we are also able to learn the feature representation automatically rather than relying on the hand-crafted features that have dominated anomaly detection research for many decades. We demonstrate our proposal on the specific application of video anomaly detection, and the experimental results indicate that our method performs better than the baselines and is comparable with state-of-the-art methods on many benchmark video anomaly detection datasets.
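The scoring idea in the abstract (fit a generative model to normal data only, then flag points with low normality) can be illustrated with a minimal sketch. The abstract does not specify the model or the features, so everything here is an assumption made for illustration: a small Bernoulli restricted Boltzmann machine trained with one-step contrastive divergence stands in for the energy-based model, toy 16-bit binary vectors stand in for video features, and the names `TinyRBM`, `prototype` and `threshold` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1).

    Illustrative stand-in only; not the architecture used in the paper.
    """

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def free_energy(self, v):
        # F(v) = -b.v - sum_j softplus(c_j + v.W_j); lower means "more normal".
        return -(v @ self.b) - np.logaddexp(0.0, v @ self.W + self.c).sum(axis=1)

    def fit(self, data, epochs=200, lr=0.1):
        n = len(data)
        for _ in range(epochs):
            # Positive phase: hidden activations driven by the data.
            h_prob = sigmoid(data @ self.W + self.c)
            h_sample = (self.rng.random(h_prob.shape) < h_prob).astype(float)
            # Negative phase: one Gibbs step back to the visibles and up again.
            v_recon = sigmoid(h_sample @ self.W.T + self.b)
            h_recon = sigmoid(v_recon @ self.W + self.c)
            # CD-1 parameter updates (full batch).
            self.W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n
            self.b += lr * (data - v_recon).mean(axis=0)
            self.c += lr * (h_prob - h_recon).mean(axis=0)
        return self


# Toy "normal" data: noisy copies of one prototype pattern (5% bit flips).
data_rng = np.random.default_rng(1)
prototype = np.array([1.0] * 8 + [0.0] * 8)
flips = (data_rng.random((200, 16)) < 0.05).astype(float)
normal = np.abs(prototype - flips)

# Unsupervised training: only normal, unlabeled samples are used.
rbm = TinyRBM(n_visible=16, n_hidden=4).fit(normal)

# Abnormality score = free energy; the threshold choice is an assumption.
train_scores = rbm.free_energy(normal)
threshold = train_scores.max()

anomaly = 1.0 - prototype  # a pattern never seen during training
anomaly_score = rbm.free_energy(anomaly[None, :])[0]
prototype_score = rbm.free_energy(prototype[None, :])[0]
is_anomalous = anomaly_score > threshold
```

In this sketch a high free energy plays the role of a low normality score. Taking the maximum training score as the threshold is only one simple choice; a percentile of the training scores or a validation-based cutoff would fit the same scheme.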
