Modeling the Distribution of Normal Data in Pre-Trained Deep Features for Anomaly Detection

Anomaly Detection (AD) in images is a fundamental computer vision problem: identifying images and/or image substructures that deviate significantly from the norm. Popular AD algorithms commonly try to learn a model of normality from scratch using task-specific datasets, but are limited to semi-supervised approaches employing mostly normal data, because anomalies are inaccessible on a large scale and their appearance is inherently ambiguous. We follow an alternative approach and demonstrate that deep feature representations learned by discriminative models on large natural image datasets are well suited to describe normality and to detect even subtle anomalies. We establish our model of normality by fitting a multivariate Gaussian to deep feature representations of classification networks trained on ImageNet, using normal data only in a transfer learning setting. By subsequently applying the Mahalanobis distance as the anomaly score, we outperform the current state of the art on the public MVTec AD dataset, achieving an Area Under the Receiver Operating Characteristic curve (AUROC) of $95.8 \pm 1.2$ (mean $\pm$ SEM) over all 15 classes. We further investigate, using Principal Component Analysis, why the learned representations are discriminative for the AD task. We find that the principal components carrying little variance in normal data are the ones crucial for discriminating between normal and anomalous instances, which offers a possible explanation for the often sub-par performance of AD approaches trained from scratch on normal data only. By selectively fitting a multivariate Gaussian to only these most relevant components, we further reduce model complexity while retaining AD performance. We also investigate setting the working point by selecting acceptable False Positive Rate thresholds based on the multivariate Gaussian assumption.
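The core scoring pipeline described above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation: the random matrix `normal_feats` stands in for deep features extracted from an ImageNet-pretrained network, and the small ridge added to the covariance before inversion is an assumed regularization for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for deep features of normal training images; in the paper these
# would come from an ImageNet-pretrained classification network.
# Shape: (n_samples, n_features).
normal_feats = rng.normal(loc=0.0, scale=1.0, size=(500, 16))

# Fit the multivariate Gaussian model of normality: sample mean and covariance.
mean = normal_feats.mean(axis=0)
cov = np.cov(normal_feats, rowvar=False)
# Small ridge keeps the covariance well conditioned before inversion
# (an assumed regularization, in the spirit of shrinkage estimators).
cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))

def mahalanobis_score(x):
    """Anomaly score: Mahalanobis distance of feature vector x to the Gaussian."""
    d = x - mean
    return float(np.sqrt(np.einsum("...i,ij,...j->...", d, cov_inv, d)))

# A held-out normal sample should score low; a shifted sample should score high.
normal_sample = rng.normal(0.0, 1.0, size=16)
anomalous_sample = rng.normal(5.0, 1.0, size=16)
print(mahalanobis_score(normal_sample), mahalanobis_score(anomalous_sample))
```

At test time, thresholding this score at a value chosen for an acceptable False Positive Rate under the Gaussian assumption yields the anomaly decision.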
