Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection

Despite their successes, deep neural networks may make unreliable predictions when faced with test data drawn from a distribution different from that of the training data, constituting a major problem for AI safety. While this has recently motivated the development of methods to detect such out-of-distribution (OoD) inputs, a robust solution is still lacking. We propose a new probabilistic, unsupervised approach to this problem based on a Bayesian variational autoencoder model, which estimates a full posterior distribution over the decoder parameters using stochastic gradient Markov chain Monte Carlo, instead of fitting a point estimate. We describe how information-theoretic measures based on this posterior can then be used to detect OoD inputs both in input space and in the model's latent space. We empirically demonstrate the effectiveness of our proposed approach.
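The abstract does not spell out the information-theoretic measures, but one natural instance of the idea is a disagreement score: draw several decoders from the posterior over decoder parameters, score each input under each decoder, and flag inputs on which the posterior samples disagree. The sketch below illustrates only this scoring step on simulated log-likelihoods; the function name `ood_score`, the array shapes, and the synthetic numbers are illustrative assumptions, not the paper's actual measure or implementation.

```python
import numpy as np

def ood_score(log_likelihoods):
    # Disagreement across posterior decoder samples: for each input,
    # the variance of its log-likelihood under decoders drawn from the
    # posterior (e.g. via SG-MCMC).  High disagreement suggests the
    # input lies off the training distribution.
    # log_likelihoods: shape (n_posterior_samples, n_inputs)
    return np.var(log_likelihoods, axis=0)

rng = np.random.default_rng(0)
n_post, n_inputs = 50, 4
# Simulated scores: posterior decoders roughly agree on
# in-distribution inputs but disagree strongly on OoD inputs.
ll_in = -100.0 + 0.1 * rng.standard_normal((n_post, n_inputs))
ll_ood = -100.0 + 5.0 * rng.standard_normal((n_post, n_inputs))
s_in, s_ood = ood_score(ll_in), ood_score(ll_ood)
```

The same disagreement idea can equally be applied to reconstructions in latent space; variance is used here simply as the most elementary measure of posterior spread.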
