Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality

Recent work has shown that deep generative models can assign higher likelihood to out-of-distribution data sets than to their training data [37, 9]. We posit that this phenomenon is caused by a mismatch between the model’s typical set and its areas of high probability density. In-distribution inputs should reside in the former but not necessarily in the latter, as previous work has presumed [6]. To determine whether or not inputs reside in the typical set, we propose a statistically principled, easy-to-implement test using the empirical distribution of model likelihoods. The test is model agnostic and widely applicable, only requiring that the likelihood can be computed or closely approximated. We report experiments showing that our procedure can successfully detect the out-of-distribution sets in several of the challenging cases reported by Nalisnick et al. [37].
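The core of the procedure can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes per-example log-likelihoods from a trained generative model are already available as NumPy arrays (the names train_log_probs, val_log_probs, and batch_size are hypothetical), estimates the model's entropy from in-distribution log-likelihoods, calibrates a rejection threshold by bootstrap resampling of in-distribution validation batches, and flags a test batch as out-of-distribution when its average negative log-likelihood is atypically far from the entropy estimate.

import numpy as np

def entropy_estimate(train_log_probs):
    # Approximate H[p] by the empirical average of -log p(x) over
    # in-distribution data (a cross-entropy estimate that matches H[p]
    # when the model fits the data distribution well).
    return -np.mean(train_log_probs)

def calibrate_threshold(val_log_probs, batch_size, entropy,
                        n_bootstrap=5000, alpha=0.99, seed=0):
    # Bootstrap the in-distribution sampling distribution of the statistic
    # |batch-average NLL - entropy| and return its alpha-quantile.
    rng = np.random.default_rng(seed)
    stats = np.empty(n_bootstrap)
    for i in range(n_bootstrap):
        batch = rng.choice(val_log_probs, size=batch_size, replace=True)
        stats[i] = abs(-np.mean(batch) - entropy)
    return np.quantile(stats, alpha)

def is_out_of_distribution(test_log_probs, entropy, threshold):
    # Flag the batch as OOD when its average negative log-likelihood
    # deviates from the entropy estimate by more than the threshold,
    # in either direction.
    return abs(-np.mean(test_log_probs) - entropy) > threshold

Because the statistic measures distance from the typical set rather than low likelihood alone, it can also flag inputs whose likelihood is suspiciously high; with a batch size of one it reduces to comparing a single negative log-likelihood against the entropy estimate.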

[1] Alexander A. Alemi et al. WAIC, but Why? Generative Ensembles for Robust Anomaly Detection, 2018.

[2] Oldrich A. Vasicek. A Test for Normality Based on Sample Entropy, 1976.

[3] H. Joe. Estimation of entropy and other functionals of a multivariate density, 1989.

[4] Emanuel Parzen. Goodness of Fit Tests and Entropy, 1990.

[5] Lucas C. Parra et al. Statistical Independence and Novelty Detection with Information Preserving Nonlinear Maps, 1996, Neural Computation.

[6] Václav Smídl et al. Are generative deep models for novelty detection truly better?, 2018, ArXiv.

[7] D. Gokhale. On entropy-based goodness-of-fit tests, 1983.

[8] C. Huber-Carol. Goodness-of-Fit Tests and Model Validity, 2012.

[9] Robert D. Nowak et al. Learning Minimum Volume Sets, 2005, J. Mach. Learn. Res.

[10] Arthur Gretton et al. A Kernel Test of Goodness of Fit, 2016, ICML.

[11] Yoshua Bengio et al. Generative Adversarial Nets, 2014, NIPS.

[12] Jasper Snoek et al. Likelihood Ratios for Out-of-Distribution Detection, 2019, NeurIPS.

[13] Thomas G. Dietterich et al. Deep Anomaly Detection with Outlier Exposure, 2018, ICLR.

[14] Thomas M. Cover et al. Elements of Information Theory, 2005.

[15] F. Massey. The Kolmogorov-Smirnov Test for Goodness of Fit, 1951.

[16] Michael Lindenbaum et al. Learning High-Density Regions for a Generalized Kolmogorov-Smirnov Test in High-Dimensional Data, 2012, NIPS.

[17] Edward C. van der Meulen et al. Entropy-Based Tests of Uniformity, 1981.

[18] Robert Tibshirani et al. An Introduction to the Bootstrap, 1994.

[19] Vic Barnett et al. Outliers in Statistical Data, 1980.

[20] C. E. Shannon. A Mathematical Theory of Communication, 1948.

[21] Roman Vershynin. High-Dimensional Probability, 2018.

[22] James J. Little et al. Does Your Model Know the Digit 6 Is Not a Cat? A Less Biased Evaluation of "Outlier" Detectors, 2018, ArXiv.

[23] Yee Whye Teh et al. Do Deep Generative Models Know What They Don't Know?, 2018, ICLR.

[24] Bernhard Schölkopf et al. A Kernel Two-Sample Test, 2012, J. Mach. Learn. Res.

[25] H. A. Noughabi et al. General treatment of goodness-of-fit tests based on Kullback–Leibler information, 2013.

[26] Samy Bengio et al. Density estimation using Real NVP, 2016, ICLR.

[27] W. Polonik. Minimum volume sets and generalized quantile processes, 1997.

[28] David A. Clifton et al. A review of novelty detection, 2014, Signal Process.

[29] Shelby J. Haberman. A Warning on the Use of Chi-Squared Statistics with Frequency Tables with Small Expected Cell Counts, 1988.

[30] David Haussler et al. Exploiting Generative Models in Discriminative Classifiers, 1998, NIPS.

[31] Michael Brady et al. Novelty detection for the identification of masses in mammograms, 1995.

[32] Max Welling et al. Auto-Encoding Variational Bayes, 2013, ICLR.

[33] Bernhard Schölkopf et al. Estimating the Support of a High-Dimensional Distribution, 2001, Neural Computation.

[34] Prafulla Dhariwal et al. Glow: Generative Flow with Invertible 1x1 Convolutions, 2018, NeurIPS.

[35] M. Stephens. EDF Statistics for Goodness of Fit and Some Comparisons, 1974.

[36] Alex Graves et al. Conditional Image Generation with PixelCNN Decoders, 2016, NIPS.

[37] Yin Xia et al. A Nonparametric Normality Test for High-dimensional Data, 2019.

[38] Przemysław Grzegorzewski et al. Entropy-based goodness-of-fit test for exponentiality, 1999.

[39] Kwang-Hyun Cho et al. Level sets and minimum volume sets of probability density functions, 2003, Int. J. Approx. Reason.

[40] David A. Clifton et al. Extending the Generalised Pareto Distribution for Novelty Detection in High-Dimensional Spaces, 2013, J. Signal Process. Syst.

[41] J. Neyman and E. S. Pearson. On the Problem of the Most Efficient Tests of Statistical Hypotheses, 1933.

[42] L. Györfi et al. Nonparametric entropy estimation: An overview, 1997.

[43] Qiang Liu et al. A Kernelized Stein Discrepancy for Goodness-of-fit Tests, 2016, ICML.

[44] Daan Wierstra et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.

[45] V. LaRiccia et al. Asymptotic Comparison of Cramer-von Mises and Nonparametric Function Estimation Techniques for Testing Goodness-of-Fit, 1992.

[46] Shakir Mohamed et al. Distribution Matching in Variational Inference, 2018, ArXiv.

[47] D. Darling et al. A Test of Goodness of Fit, 1954.

[48] B. Efron. Bootstrap Methods: Another Look at the Jackknife, 1979.

[49] T. Sager. An Iterative Method for Estimating a Multivariate Mode and Isopleth, 1979.

[50] Anders Høst-Madsen et al. Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data, 2019, Entropy.

[51] E. Tabak et al. A Family of Nonparametric Density Estimation Algorithms, 2013.

[52] Shakir Mohamed et al. Learning in Implicit Generative Models, 2016, ArXiv.

[53] Michael Betancourt. A Conceptual Introduction to Hamiltonian Monte Carlo, 2017, arXiv:1701.02434.

[54] E. Giné et al. On the Bootstrap of U and V Statistics, 1992.

[55] S. S. Wilks. The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses, 1938.

[56] A. Martin-Löf. On the composition of elementary errors, 1994.

[57] Wolfgang Polonik. Concentration and goodness-of-fit in higher dimensions: (asymptotically) distribution-free methods, 1999.

[58] Stéphan Clémençon et al. Mass Volume Curves and Anomaly Ranking, 2017, arXiv:1705.01305.

[59] Arthur Gretton et al. Demystifying MMD GANs, 2018, ICLR.

[60] Christopher M. Bishop. Novelty detection and neural network validation, 1994.