Modeling the Distribution of Normal Data in Pre-Trained Deep Features for Anomaly Detection

Anomaly Detection (AD) in images is a fundamental computer vision problem and refers to identifying images and/or image substructures that deviate significantly from the norm. Popular AD algorithms commonly try to learn a model of normality from scratch using task-specific datasets, but they are limited to semi-supervised approaches that employ mostly normal data, because anomalies are inaccessible on a large scale and their appearance is ambiguous. We follow an alternative approach and demonstrate that deep feature representations learned by discriminative models on large natural image datasets are well suited to describing normality and detecting even subtle anomalies in a transfer learning setting. Our model of normality is established by fitting a multivariate Gaussian (MVG) to deep feature representations of classification networks trained on ImageNet, using normal data only. By subsequently applying the Mahalanobis distance as the anomaly score, we outperform the current state of the art on the public MVTec AD dataset, achieving an Area Under the Receiver Operating Characteristic curve of 95.8 ± 1.2% (mean ± SEM) over all 15 classes. We further investigate why the learned representations are discriminative for the AD task using Principal Component Analysis. We find that the principal components containing little variance in normal data are the ones crucial for discriminating between normal and anomalous instances. This gives a possible explanation for the often subpar performance of AD approaches trained from scratch using normal data only. By selectively fitting an MVG to only these most relevant components, we are able to further reduce model complexity while retaining AD performance. We also investigate setting the working point by selecting acceptable False Positive Rate thresholds based on the MVG assumption. Code is publicly available at https://github.com/ORippler/gaussian-ad-mvtec.
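The scoring idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). It assumes that deep features have already been extracted and pooled into one vector per image, and it uses synthetic features as a stand-in. The function names, the fixed `shrinkage` value, and the choice of `keep` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_gaussian(features, shrinkage=1e-3):
    """Fit mean and covariance to normal-only features of shape (n, d)."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    d = cov.shape[0]
    # Assumption: a fixed shrinkage toward a scaled identity keeps the
    # covariance well conditioned; the weight here is not a tuned value.
    cov = (1.0 - shrinkage) * cov + shrinkage * (np.trace(cov) / d) * np.eye(d)
    return mean, cov


def mahalanobis_scores(features, mean, cov):
    """Anomaly score: Mahalanobis distance of each row to the fitted Gaussian."""
    diff = features - mean
    # Solve cov @ x = diff.T instead of forming an explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def low_variance_scores(features, mean, cov, keep):
    """Mahalanobis distance restricted to the `keep` lowest-variance components."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    projected = (features - mean) @ eigvecs[:, :keep]
    return np.sqrt(np.sum(projected**2 / eigvals[:keep], axis=1))


# Synthetic stand-in for pooled deep features (hypothetical data).
rng = np.random.default_rng(0)
scales = np.array([3.0, 3.0, 2.0, 2.0, 1.0, 1.0, 0.2, 0.2])
train_normal = rng.normal(size=(500, 8)) * scales
test_normal = rng.normal(size=(200, 8)) * scales
# Anomalies deviate only along a direction with little variance in normal data.
test_anomalous = rng.normal(size=(200, 8)) * scales
test_anomalous[:, -1] += 1.0

mean, cov = fit_gaussian(train_normal)
full_normal = mahalanobis_scores(test_normal, mean, cov)
full_anomalous = mahalanobis_scores(test_anomalous, mean, cov)
reduced_normal = low_variance_scores(test_normal, mean, cov, keep=2)
reduced_anomalous = low_variance_scores(test_anomalous, mean, cov, keep=2)
```

In this toy setup the anomalies differ from normal data only along a low-variance direction, so restricting the score to the lowest-variance components still separates the two groups while using fewer dimensions. A threshold for a chosen False Positive Rate could then be derived from the Gaussian assumption, under which the squared Mahalanobis distance follows a chi-squared distribution with as many degrees of freedom as retained dimensions.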
